A Reversible Stoichiometric Process
for Conjugating Biomolecules 

Background of the Invention 

Methods for reversibly linking biomolecules (e.g. nucleic acids with
reporter groups or to solid supports) are important for many applications in the life
sciences; they are used, amongst other applications, in DNA sequencing, DNA diagnostics,
nucleic acid purification, Polymerase and Ligase Chain Reactions (PCR, LCR),
hybridization experiments and solid phase biochemistry. Most frequently, a reversible
linkage is accomplished via a streptavidin-biotin interaction (L.G. Mitchel and C.R.
Merril (1989) Anal. Biochem. 128, 239-242; B.H. Bowman and S.R. Palumbi (1993) in
E.A. Zimmer, R.L. Cann and A.C. Wilson (ed.) Methods in Enzymology, Academic
Press, New York, Vol. 221, pp. 399-405; X. Tong and L.M. Smith (1992) Anal. Chem.
64, 2672-2677).

Another reversible linkage, which is particularly amenable to the linkage of
nucleic acids, can be accomplished via heterobifunctional trityl groups, which can be
cleaved under acidic conditions (E. Leikauf, F. Barnekow and H. Köster,
Heterobifunctional Trityl Derivatives as Linking Agents for the Recovery of Nucleic
Acids after Labeling and Immobilization (1995) Tetrahedron 51, 3793-3802; H. Köster,
J.M. Coull and B. Gildea, Succinimidyl Trityl Compounds and a Process for Preparing
Same, Protecting Groups for Natural Products, US Patent 5,410,068).

The interaction of metal chelates with polypeptide sequences such as
oligohistidine has been used for affinity chromatography of proteins (J. Porath (1992)
Protein Express. Purif. 2, 263-281; M.C. Smith et al. (1988) J. Biol. Chem. 262,
7211-7215; E. Hochuli and S. Piesecki (1992) Methods 4, 68-72; E. Hochuli et al. (1988)
Bio/Technology 6, 1321-1325; E. Blum et al. (1994) Biochem. Biophys. J. 22, 113-121;
see also European Patent No. 0 253 303 to Hoffmann-La Roche AG) and of nucleic acids
(Ch. Min and G. L. Verdine, Immobilized Metal Affinity Chromatography of DNA (1996)
Nucleic Acids Res. 24, 3806-3810), and recently a system to detect proteins has been
introduced (Qiagen (1996): QIAexpress Detection System). Occasionally, disulfide
bridges, which can be cleaved under reducing conditions, are also used.

However, in applications in which proteins (e.g. antibodies, enzymes) are
to be linked to nucleic acids (e.g. for the detection of nucleic acids), no specific and
reproducible linkage to the nucleic acids can be established, because during
chemical functionalization or activation of functional groups on the surface of the
protein, no precise selection of amino acid side chains is possible and therefore neither
the attachment site nor the stoichiometry can be controlled. The results
obtained can therefore differ from batch to batch, which hampers the development
of quantitative nucleic acid detection systems. In addition, there is no control over
whether the modified amino acid side chain lies within the active site. These factors all
reduce the technical value of such procedures.

The application of solid phase techniques simplifies the preparation and 
purification of the reaction products, which is important for subsequent analytical and 
biochemical procedures. Since in some cases cleavage of one of the products from the 
support is needed, e.g. for further biochemical reactions in solution or signal detection, a 
combination of at least two different reversible linkages cleavable under mild and 
selective conditions is needed. 

Summary of the Invention

In one aspect, the invention features compositions comprised of at least 
two biopolymers (e.g. nucleic acids or polypeptides), which are conjugated to an 
insoluble support by two different reversible linkages, which are cleavable under selective 
conditions. 

In another aspect, the invention features novel methods and components 
for specifically conjugating biomolecules under completely controlled stoichiometry 
based on the specific and strong interaction between chelators in the presence of metal 
ions. In one embodiment, imidazolyl moieties are introduced via the introduction of
histidine residues (e.g. oligo-His) into a polypeptide (e.g. by recombinant DNA
techniques). The oligo-His polypeptide can then interact in the presence of a metal with 
a nucleic acid carrying a chelator functionality at a position which is exposed and does 
not interfere with Watson-Crick base pairing of the nucleic acid. In another embodiment, 
which is particularly well-suited for the attachment of biomolecules other than 
polypeptides or for the reversible immobilization of nucleic acid molecules, the nucleic
acid can carry a series of imidazolyl functionalities in a format which makes them 
available for chelation and which does not interfere with Watson-Crick base pairing; in 
which case, the other conjugating partner molecule can carry the chelator functionality. 

By combining this reversible concept with other reversible or irreversible 
linkages, novel biochemical formats including diagnostic assays are possible in which 
favorable solid phase procedures are coupled with sensitive detection principles. 

Brief Description of the Figures

Figure 1 (a) and (c) pictorially depict two general approaches of the 
invention in which a spacer molecule, A, linked to a polymer support, P, forms a 
reversible linkage, I, to a nucleic acid or protein/peptide molecule, B, which itself is 
linked by another reversible linkage, II, to either a nucleic acid, protein/peptide or small 
molecule (e.g. reporter molecule). Linkage I can be a heterobifunctional trityl group, a
hydrophobic interaction stable under aqueous conditions or a photocleavable bond, and II
can be a bond which is generated through a chelate complex. The two parts which form
the linkage can be reversed (I', II') as shown in (b) and (d).

Figure 2 schematically depicts a nucleic acid molecule, B, which is linked 
through a spacer, A, via a reversible linkage, I, to a polymer support, P. B interacts via 
Watson-Crick complementarity with a nucleic acid molecule, C, which in turn through 
another reversible linkage II allows interaction with a reporter functionality D which can 
be a protein (enzyme), a nucleic acid or a small detector molecule. 

Figure 3 schematically depicts the same approach as in Figure 2 with the
exception that B is linked to the polymer support through a spacer A with a
non-reversible linkage.

Figure 4 shows an example of the chelate complex formed between a six
residue histidine (his6) tail and nitrilotriacetic acid (NTA) in the presence of Ni2+.

Figure 5 schematically depicts a reaction wherein a protected
N,N-dicarboxymethyl-serine phosphoamidite is synthesized as a chemical building block
to introduce the NTA functionality into synthetic oligonucleotides.

Figure 6 shows the synthesis of a chelate-linked oligonucleotide to a his6-BAP
(bacterially generated alkaline phosphatase) conjugate by use of the phosphoamidite
chelate precursor.



Figure 7 shows the synthesis of a chelate-linked oligonucleotide to a his6-BAP
conjugate via retritylation and subsequent substitution with a chelate building block.

Figure 8 shows the structure of imidazolyl phosphoamidite building blocks 
for the single or multiple addition of an imidazolyl moiety during chemical 
oligonucleotide synthesis. 

Figure 9 depicts the introduction of an imidazolyl moiety through an 
imidazolylnucleoside phosphoamidite. 

Figure 10 shows the introduction of multiple imidazolyl moieties through 
chemical peptide synthesis of oligohistidine onto an oligonucleotide during solid phase 
chemical synthesis of oligonucleotides. 

Figure 11 shows the chelate modified uracil and adenine nucleoside
triphosphates for the enzymatic introduction of chelate functionalities into nucleic acids.
Corresponding derivatives can be envisioned for cytidine, guanine or modified
nucleosides.

Figure 12 shows imidazolyl modified uracil and adenine nucleoside 
triphosphates for the enzymatic introduction of imidazolyl moieties into nucleic acids. 
Corresponding derivatives can be envisioned for cytidine, guanine and modified 
nucleosides. 

Figure 13 schematically depicts solid phase separation/detection using 
NHS-DMT oligonucleotides linked to a solid phase and subsequently linked to a
BAP-his6 detector molecule via the LCR (Ligase Chain Reaction).

Figure 14 schematically depicts the detection of Polymerase Chain 
Reaction (PCR) products via the process of the invention. 

Detailed Description of the Invention 

As shown in Figure 1, two different reversible linkages I and II (a, c),
which could be positioned with their functionalities reversed (I', II'; b, d), are used to link
"biomolecules" or "biopolymers" (i.e. organic molecules, including nucleic acids,
peptides, polypeptides) to an insoluble support. The circled P represents an insoluble
or solid support.

"Insoluble supports" or "solid supports" as used herein can be flat, such
as membranes, glass plates, metals, plastic films and composites thereof, with a
homogeneously functionalized surface or functionalized to result in an array format,
including flat supports with pits, wells, combs, microtiter plates, microtiter filter plates;
flat supports can also be magnetic or with an array shaped (checkered) magnetic field;
solid supports can also be used as beads from different plastic materials, inorganic
supports such as silica, CPG (Controlled Pore Glass), metal, different polymeric material,
cellulose, Sephadex, Sepharose; the beads can be porous or non-porous, of different
diameter and magnetic or non-magnetic. A combination of beads in the pits/wells of
flat supports, thus forming an array format, can also be employed.

Compound A can be a spacer, a nucleic acid sequence (or nucleic acid
analog/mimetic) or a protein or peptide sequence; B can be a nucleic acid (or a nucleic
acid analog/mimetic) or a peptide or protein, whereas C can be a nucleic acid (or a nucleic
acid analog/mimetic), protein/peptide or a small reporter molecule. As an example, A is a
spacer and I is a heterobifunctional trityl group which is coupled to a nucleic acid B; B
carries a chelate functionality which interacts with the poly-his tail of a recombinant
alkaline phosphatase (his6-AP), which carries e.g. a sequence of six histidine residues at
the C-terminal end of the polypeptide chain. If a chromogenic or fluorogenic substrate is
added, for example, dephosphorylation generates color or light, thereby providing a
nucleic acid detection system. The advantage of this system is that the detection can be
done either on the insoluble support or after releasing B from the support by cleavage of
bond I (or I'). It is therefore possible to remove all side-products from a reaction by
filtration, owing to the attachment to a solid phase, before performing the analytical step in
solution. This leads to a robust, reproducible performance.

Figure 2 shows schematically how amplification products B-C (e.g. from the
polymerase chain reaction (PCR) or ligase chain reaction (LCR)) can be captured specifically,
purified and subsequently detected on the support or in solution. The first reversible
linkage I (or I'), e.g. a heterobifunctional trityl group, anchors one strand of the LCR or
PCR product via a spacer A to the support through an acid labile tritylether bond, the
precursor of which has been introduced by an appropriately functionalized primer during
the LCR or PCR reaction. The strand C carries e.g. the chelate functionality, also
introduced by using an appropriately functionalized primer during PCR or LCR. The
chelate moiety can then interact with a reporter functionality, e.g. his6-AP, for subsequent
detection and quantification of the amplification product. B can also be a cDNA molecule
which can be linked through its 5'-end to the polymer support. With appropriate primers,
solid phase DNA sequencing can be performed. In an array format, this could
be used for high throughput genetic and expression profiling experiments.

As shown in Figure 2, B could also be a specific (or oligo-dT) capture
sequence to fish mRNA. The cDNA can be directly synthesized, since the capture
sequence can simultaneously act as a primer for the RNA dependent DNA polymerase.
The RNA can be removed, the cDNA purified by washing and filtration steps and either
released or directly used for subsequent DNA sequencing. It can also be envisioned that
the capture sequence, while serving as a primer for the RNA dependent DNA polymerase,
can be used directly to generate sequencing ladders employing ddNTPs as terminators.
After purification of the sequencing ladders by washing and filtration, the bond to the
polymer support is cleaved and the purified sequencing ladders are subjected to either gel
electrophoretic or mass spectrometric separation (H. Köster et al., A Strategy for Rapid
and Efficient DNA Sequencing by Mass Spectrometry, Nature Biotech. (1996) 14,
1123-1128; U.S. Patent No. 5,547,035 to H. Köster; International Patent Application No.
WO 94/21822 to H. Köster; and International Patent Application No. WO 96/29431 to H.
Köster).

Figure 3 shows a simplified version of Figure 2 in that nucleic acid 
fragment B is immobilized through a non-reversible bond via a spacer A to the solid 
support whereas nucleic acid C carries the reporter functionality via a reversible linkage 
so that detection can be performed either on the support or in solution. 

In Figures 1-3, biopolymer C or D could be synthetic peptides linked to an
immobilized nucleic acid B or B-C, respectively, via a reversible linkage as described
(heterobifunctional trityl, photocleavable, chelate, hydrophobic interaction) and then
detected by mass spectrometry. Various defined peptide sequences can form a specific
mass tag which can be used as a specific nucleic acid identifier. Conversely, specific
nucleic acid sequences can be used as mass tags (specific identifiers) for proteins
immobilized through a spacer A.
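
The mass tag idea can be illustrated with a short calculation. The following is a minimal sketch, not part of the described process: it assumes unmodified linear peptides, uses standard monoisotopic residue masses, and the function names, example sequences and the 1 Da separation threshold are hypothetical choices made only for illustration.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water accounts for the free N- and C-termini


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def tags_are_resolvable(tags, min_separation=1.0):
    """True if all tag masses differ pairwise by at least min_separation Da.

    min_separation is an assumed, illustrative threshold, not an
    instrument specification.
    """
    masses = sorted(peptide_mass(tag) for tag in tags)
    return all(b - a >= min_separation for a, b in zip(masses, masses[1:]))
```

Peptides with the same composition in a different order have the same mass, so a set of tags intended as identifiers would need to differ in composition, not only in sequence.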

For use in the instant process, nucleic acids can be single stranded or 
double stranded polynucleotides (including oligonucleotides), whether natural or 
synthetic, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or 
DNA/RNA hybrids, DNA containing ribonucleotides and/or dideoxyribonucleotides and
RNA containing deoxyribonucleotides. Also encompassed by the term "nucleic acid" are
modified nucleotides (e.g. phosphorothioate modified) as well as nucleic acid mimetics or
analogs, such as peptide nucleic acids (PNAs).

As used herein, the terms "protein", "polypeptide" or "peptide" are all 
used interchangeably to refer to gene products. Proteins can be antibodies, enzymes, 
receptor molecules; peptides could be of natural or synthetic origin with an oligo-his tail, a
functionality for hydrophobic interaction, a photocleavable functionality or chelator 
functionality and displaying different properties such as being adhesive or representing 
specific ligand-receptor or specific protease cleavage sites. 

As used herein, the term oligo his tail or poly his tail refers to a chain of 
conjugated histidine residues. Preferred oligo his tails contain 2-10 histidine residues. 
Particularly preferred oligo his tails are in the range of about 4 to about 8 his residues.

Reversible linkages can be formed by hydrophobic interaction between e.g. a trityl group
(i.e. with long aliphatic alkyl chains) and a long aliphatic chain e.g. attached to a polymer
support or a hydrophobic polymer surface such as that of polystyrene. Since most
biochemical and molecular biological reactions are performed in aqueous solution, such
hydrophobic interaction might be of sufficient stability. Addition of organic solvents such
as alcohols, acetonitrile, N,N-dimethylformamide and the like will destabilize (if
necessary in conjunction with heat) the hydrophobic interaction and release the attached
molecules.

A reversible linkage which can independently be addressed could also be a 
functionality which is cleavable under photolytic conditions (see e.g. J. Olejnik, E. 
Krzymanska-Olejnik and K.J. Rothschild, Photocleavable Biotin Phosphoramidites for
5'-End-labeling, Affinity Purification and Phosphorylation of Synthetic Oligonucleotides
(1996) Nucleic Acids Res. 24, 361-366). If the wavelength needed for photocleavage is
in the range of the laser wavelength used in MALDI mass spectrometry, this bond can be 
cleaved during mass spectrometric signal acquisition. 

A reversible linkage can also be formed from a chelator functionality
which interacts with another chelator (e.g. oligo-imidazolyl or other oligopeptide
moieties) in the presence of a metal ion. The term "chelator" refers to a single molecule
which comprises at least two Lewis basic atoms that are capable of associating
simultaneously with a Lewis acidic atom, molecule or ion, either simple or complex.
"Lewis base" is an art recognized term that refers to chemical moieties which are capable
of donating to another atom or moiety at least one pair of unshared electrons. Examples
include uncharged functional groups such as alcohols, ethers, carbonyls, thiols, sulfides,
amines, imines, and pyridine and imidazole nitrogens; and charged functional groups,
such as alkoxides, thiolates, carboxylates and a variety of other anions. "Lewis acid" is
an art recognized term that refers to chemical moieties which are capable of accepting
from another atom or moiety (e.g. a Lewis basic atom or moiety) at least one pair of
unshared electrons. Examples of Lewis acid moieties include transition metal halides
with at least one vacant d orbital, alkali metal cations, alkaline-earth metal cations, and
trivalent boron or aluminum compounds. A "bidentate chelator", "tridentate chelator"
and "tetradentate chelator" refer to chelators comprising two, three and four Lewis basic
moieties, respectively, capable of simultaneous donation of at least an equal number of
unshared electron pairs to another atom, ion or moiety.
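
The terminology defined above can be restated as a small lookup. This is only an illustrative sketch: the function names are hypothetical, and the coordination number passed to open_sites is an assumption supplied by the caller (for instance six for an octahedral metal ion), not a value taken from this description.

```python
# Names for ligands by number of Lewis basic donor moieties.
DENTICITY_NAMES = {
    1: "monodentate",
    2: "bidentate",
    3: "tridentate",
    4: "tetradentate",
}


def denticity_name(donor_count: int) -> str:
    """Return the conventional name for a ligand with donor_count donors."""
    if donor_count < 1:
        raise ValueError("a ligand needs at least one Lewis basic donor")
    return DENTICITY_NAMES.get(donor_count, f"{donor_count}-dentate")


def is_chelator(donor_count: int) -> bool:
    """A chelator, as defined above, has at least two Lewis basic atoms."""
    return donor_count >= 2


def open_sites(coordination_number: int, donor_count: int) -> int:
    """Coordination sites left on the metal after the chelator binds.

    coordination_number is an assumed property of the metal ion.
    """
    return max(coordination_number - donor_count, 0)
```

Under the stated assumption of a six-coordinate metal ion, a tetradentate chelator would leave two sites available for a second chelating partner such as imidazolyl groups.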

Figure 4 depicts a specific example in which the chelator functionality is a
nitrilotriacetic acid (NTA) which coordinates with divalent metal cations such as Ni2+ and
forms a strong complex with six imidazolyl groups from a his6 tail linked to one of the
conjugating partner molecules. The term "imidazolyl residue" or "imidazolyl group"
refers to any substituted or unsubstituted form of imidazole (i.e.
1,3-diaza-2,4-cyclopentadiene). For example, the side chain of the amino acid histidine
comprises an imidazolyl residue.

The determination of which of the two necessary functions is
attached to the nucleic acid molecule or the protein depends on the ease and convenience
of introduction of either functionality (e.g. NTA or his6 tail). In the case of proteins, the
site-specific introduction of a chelator molecule seems to be difficult, whereas the his6 tail can
be introduced through recombinant DNA technologies. In contrast to currently available
procedures for linking nucleic acids to proteins (e.g. chemical linkage using either
maleimide-thiol coupling (S.S. Gosh et al. (1990) Bioconjugate Chem. 1, 71-76) or
disulfide bonds (B.C.F. Chu and L.E. Orgel (1988) Nucleic Acids Res. 16, 3671-3691), or
linkage mediated via streptavidin, which binds both biotinylated nucleic acids and biotinylated
alkaline phosphatase (AP) (J.J. Leary et al. (1983) Proc. Natl. Acad. Sci. USA 80,
4045-4049)), the introduction of the his6 tail through recombinant DNA technologies allows
site-specific introduction.

As an example which does not limit the scope of this invention, the
process is explained for alkaline phosphatase (AP) as the protein. Alkaline phosphatase (EC
3.1.3.1) is a versatile enzyme for many molecular biological applications. It catalyzes the
hydrolysis of ester bonds in phosphomonoesters and is used in recombinant DNA
technology to remove 5'-phosphate groups from DNA fragments to prevent self-ligation
of vector DNA molecules. Coupled to antibodies or oligonucleotides, it replaces
radioactively labeled compounds by serving as a reporter and signal amplifying enzyme
which cleaves chromogenic or fluorogenic substrates in diagnostic applications for the
specific detection of DNA (Southern blot: E.M. Southern (1975) J. Mol. Biol. 98,
503-517) or proteins (Western blots: W.N. Burnett (1981) Anal. Biochem. 112, 195-203).

Predominantly, AP is isolated from calf intestine (CIP) or the bacterium E.
coli (BAP). AP is a homodimer. The stability of the enzyme, of advantage in
diagnostic applications, can lead to severe problems in cloning experiments. Residual AP
activity from the dephosphorylation of vector DNA can result in dephosphorylation of
the DNA to be inserted, so that no or only low yields of ligation products are obtained.
Heat inactivation very often is not sufficient, so that time-consuming removal is necessary
using treatment with proteinase K and subsequent extraction with phenol/chloroform.
This lengthy procedure will also drastically reduce the yield of the product. Alternatively,
AP isolated from species living at low temperatures (shrimps) is employed; here heat
inactivation is possible; however, reduced stability is disadvantageous for diagnostic
applications.

A modified BAP derived from E. coli was genetically designed with a his6
tail at its carboxy terminus. The his6 tail was introduced using inverse PCR, by which six
histidine codons followed by a stop codon were placed at the 3' end of the gene (E. Blum
et al. (1994) Biochem. Biophys. J. 22, 113-121). To achieve high expression levels of the
recombinant enzyme in E. coli, the region coding for the signal peptide of AP together
with the untranslated 5' and 3' regions were exchanged with homologous sequences from
the E. coli ompA gene. The expression of the resulting protein construct was under the
control of the IPTG (β-D-isopropyl-thiogalactoside) inducible tac promoter.

The BAP-his6 synthesized in the E. coli cell can easily be isolated from an
unpurified cell extract through affinity chromatography using commercially available
Ni-NTA resins (Qiagen), to which it forms a strong and specific chelate complex via its his6
tail. The modified enzyme is therefore now available in high yields, high purity and
reproducible batch-to-batch quality. As part of the inventive process, BAP-his6 is able to
form a stable complex with chelate-modified nucleic acids, which for the first time makes
available specific conjugates between proteins (here BAP) and nucleic acids in a
reproducible 1:1 stoichiometry.

When peptides are generated by chemical synthesis, the his6 tail can be
directly incorporated during peptide synthesis. Chemical synthesis of peptides also 
allows the alternative approach in which a chelator functionality is attached to the 
synthetic peptide either at the N- or C- terminus or one of the side chains depending on 
which part of the peptide sequence is needed for the biochemical function. 

The nucleic acid molecule can be functionalized either with the imidazolyl
moieties or with the chelator functionalities. In the case of synthetic oligonucleotides, the
chelator functionality can be introduced in different ways. An amino acid such as serine,
cysteine or lysine can be transformed into a β-cyanoethylphosphoamidite (N.D. Sinha, J.
Biernat, J. McManus and H. Köster (1984) Nucleic Acids Res. 12, 4539-4557) carrying a
precursor of the chelator functionality (e.g. NTA as described in Figures 5 and 6 with
serine as starting material). During deprotection after solid phase oligonucleotide
synthesis, the three carboxyl groups are liberated, forming an NTA (nitrilotriacetic acid)
group linked through a phosphodiester bond to the oligonucleotide chain. In yet another
way, Figure 7 shows the introduction through a heterobifunctional trityl group. The
oligonucleotide is, after regular final detritylation, retritylated with a heterobifunctional
trityl group bearing an active ester moiety, e.g. one derived from
N-hydroxysuccinimide or another active ester such as a p-nitrophenyl ester. The active
ester functionality is then reacted with a chelator molecule derived from e.g. lysine.

The imidazolyl functionality can be introduced during oligonucleotide 
synthesis employing an appropriate β-cyanoethylphosphoamidite as shown in Figure 8;
single or multiple imidazolyl residues can be incorporated. An imidazolylnucleoside as
shown in Figure 9 or a histidine peptide sequence covalently attached to the 
oligonucleotide chain (Figure 10) can also be used to introduce the necessary imidazolyl 
moieties for interaction with the chelator functionality. 

The chelator and oligoimidazolyl functionalities can also be introduced into
high molecular weight nucleic acids using either DNA dependent DNA or RNA
polymerases or RNA dependent DNA polymerases with appropriately modified
nucleoside triphosphates (NTPs, 2'-dNTPs, 3'-dNTPs or ddNTPs) as depicted in
Figure 11. The base will carry either the chelator or the oligoimidazolyl functionality
(Figure 12), in the case of pyrimidine bases at C5 and in the case of purine bases at C8, so that
Watson-Crick base pairing is possible. Using the appropriate nucleoside triphosphates,
those functionalities can be introduced either internally (NTP for RNA synthesis or
2'-dNTP for DNA synthesis) or at the 3'-end (3'-dNTP for RNA synthesis, ddNTP for DNA
synthesis). The incorporation can be performed during amplification procedures such as
PCR or SDA, or during DNA sequencing. Those skilled in the art will realize other
approaches to introduce either chelator or oligo-imidazolyl moieties into nucleic acids.
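
The selection rules in the preceding paragraph can be summarized as a lookup. This is a sketch for orientation only: the table and function names are hypothetical, and the mapping simply restates the rules given above (triphosphate type by polymer and position, attachment site by base class).

```python
# Triphosphate type by nucleic acid synthesized and position of the label.
TRIPHOSPHATE_FOR = {
    ("RNA", "internal"): "NTP",
    ("DNA", "internal"): "2'-dNTP",
    ("RNA", "3'-end"): "3'-dNTP",
    ("DNA", "3'-end"): "ddNTP",
}

# Attachment site on the base that leaves Watson-Crick pairing possible.
MODIFICATION_SITE = {"pyrimidine": "C5", "purine": "C8"}

BASE_CLASS = {
    "A": "purine", "G": "purine",
    "C": "pyrimidine", "U": "pyrimidine", "T": "pyrimidine",
}


def choose_building_block(polymer: str, position: str, base: str):
    """Return (triphosphate type, attachment site) for a modified nucleotide."""
    triphosphate = TRIPHOSPHATE_FOR[(polymer, position)]
    site = MODIFICATION_SITE[BASE_CLASS[base.upper()]]
    return triphosphate, site
```

For example, labeling the 3'-end of a DNA strand with a modified adenine corresponds to choose_building_block("DNA", "3'-end", "A").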

Detection of the immobilized nucleic acid-protein/peptide conjugates can
be achieved either directly on the polymer support or after selective cleavage of either
reversible bond I (I') or II (II'). The signal can be detected by any of a number of means
including radioactivity, fluorescence, chemiluminescence (using e.g. 1,2-dioxetane
derivatives) or colorimetric (using e.g. BCIP/NBT) methods, depending on the substrates
used as C or D (Figs. 1, 2 and 3). D can be an enzyme such as AP which, upon
contact with a substrate, triggers the signal generation through its enzymatic activity. C and D can
also be detected through their molecular weight by employing mass spectrometric
methods. Preferred mass spectrometer formats for use in analyzing the translation
products include ionization (I) techniques, including but not limited to matrix assisted
laser desorption (MALDI), continuous or pulsed electrospray (ESI) and related methods
(e.g. Ionspray or Thermospray), or massive cluster impact (MCI); these ion sources can
be matched with detection formats including linear or non-linear reflectron time-of-flight
(TOF), single or multiple quadrupole, single or multiple magnetic sector, Fourier
Transform ion cyclotron resonance (FTICR), ion trap, and combinations thereof (e.g.,
ion-trap/time-of-flight). For ionization, numerous matrix/wavelength combinations
(MALDI) or solvent combinations (ESI) can be employed. Subattomole levels of protein
have been detected, for example, using ESI (Valaskovic, G.A. et al. (1996) Science 273:
1199-1202) or MALDI (Li, L. et al. (1996) J. Am. Chem. Soc. 118: 11662-11663) mass
spectrometry.

The process of the invention is further demonstrated by solid phase 
separation and detection of Ligase Chain Reaction (LCR) products as seen in Figure 13 
and products of PCR reactions (Figure 14). To those skilled in the art it is obvious that
all applications and variations of amplification procedures including those useful for the 
detection of mutations and DNA/RNA sequencing are all adaptable to the process of the 
invention thereby significantly improving such processes. 

The present invention is further illustrated by the following Examples, 
which are intended merely to further illustrate and should not be construed as limiting. 
The entire contents of all cited references (including literature references, issued patents, 
published patent applications and co-pending patent applications, as cited throughout this 
application) are hereby expressly incorporated by reference.

Example 1 BAP-his6 Fusion Protein

The phoA gene coding for the BAP of E. coli (P.E. Berg (1981) J.
Bacteriol., 660-667; C.N. Chang et al. (1986) Gene 41, 121-125) was derived from
E. coli strain HB101. The his6 fusion at the carboxy terminus was generated via inverse
PCR with six his codons followed by a stop codon derived from plasmid pHis1 (E. Blum
et al. (1994) Biochem. Biophys. J. 22, 113-121).

To increase the expression rate of the recombinant BAP-his6 protein, its
reading frame was embedded in the untranslated regions of the E. coli ompA gene (Chen
et al. (1991) J. Bacteriol. 173, 4578-4586), coding for protein OmpA, which is a major
protein constituent of the outer membrane in Gram-negative bacteria. In addition, the
signal peptide of BAP (H. Inouye and J. Beckwith (1977) Proc. Natl. Acad. Sci. USA 74,
1440-1444) and the first two amino acids of the mature protein were replaced by the
OmpA leader peptide and the first amino acid residue of mature OmpA, resulting in a
mature chimeric BAP with the amino acid alanine instead of arginine-threonine at its
N-terminus.

To bring the expression of the chimeric BAP-his6 under the control of the
IPTG inducible chimeric tac-promoter (T. Amann et al. (1983) Gene 25, 167-178), a 2.5
kb EcoRI-PstI fragment containing the complete open reading frame of the ompA-phoA
chimera and the untranslated regions from the ompA gene was cloned into the
expression vector pHK236 (a derivative of pJF118u: Furste et al. (1986), kindly provided
by M. Kröger, Giessen) to generate the BAP-his6 expression plasmid vector pBAPHIS8.
Expression is achieved by induction of a logarithmic E. coli culture harboring plasmid
pBAPHIS8 with IPTG at a final concentration of 1 mM for 2 h under shaking in a 37°C
incubator. Isolation of BAP-his6 is carried out according to established protocols on
Ni-NTA-Agarose (E. Hochuli et al. (1987) J. Chromatography 411, 177-184).

Example 2 Dephosphorylation of DNA Fragments with Solid Phase Bound BAP-his6

A solution containing DNA fragments is incubated with beads carrying
immobilized metal ions complexed with BAP-his6 protein. To remove the enzymatic
activity after the reaction is carried out, filtration or centrifugation removes the beads with
adsorbed enzyme. Alternatively, a solution containing DNA fragments can be filtered
through a derivatized membrane carrying immobilized metal ions complexed with
BAP-his6 protein.

Example 3 Detection of LCR Products in Microtiter Filter Plates

The use of BAP-his 6 as a reporter enzyme for LCR is carried out in the
wells (96 or more) of a microtiter filter plate (MTFP) with 96 samples with oligos A-D
(Figure 13). One of the oligos (oligo A being the marker oligo, Fig. 13) carries at its 5'-end
a chelating group. In the presence of a template DNA the marker oligo is
incorporated into one strand, the marker strand, consisting of oligos A and B, with B
ligated to the 3'-end of oligo A. Under denaturing conditions (or after denaturing),
ligation products, oligos and other smaller by-products are transferred by suction into a
second MTFP with a derivatized filter membrane. To this filter, oligo D, or a part of it with
a sequence complementary to oligo B, is coupled via an NHS-DMT (heterobifunctional trityl
derivative) linkage. Hybridization occurs between membrane-bound oligo D and oligo B
or the marker strand AB. After removal of the supernatant and washing, the only oligo A
remaining in the wells of the MTFP is that incorporated into the marker strand AB by ligation.
BAP-his 6 and a divalent cation such as Ni2+ are incubated in the wells under
conditions that allow coupling of BAP-his 6 to the marker strand. After removal of
unbound BAP-his 6 by washing and filtration, chromogenic or fluorescent AP substrates
are added. Only wells containing the LCR product show AP activity as a positive result;
bound D alone or the single strand CD cannot give rise to any signal. The experimental
setup allows multiplex LCR by employing a mixture of oligos in the LCR and subsequent
transfer of the LCR products by suction through a stack of different MTFPs with specific
bound oligo sequences. This experimental setup is amenable to automation, since the
reaction can be carried out e.g. in filter tubes or filter plates, which allow removal of
contaminating agents, buffer changes and even detection in situ by dispensing and 
filtration of different liquids. 
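
The selectivity of the detection scheme in Example 3 can be summarized as a simple truth table. The following Python sketch is purely illustrative and is not part of the described procedure. The names `Species` and `gives_signal` are hypothetical, and the model assumes idealized behavior (complete washing, no non-specific binding): a species yields an AP signal only if it is both retained on the membrane by hybridization to oligo D and able to bind BAP-his 6 through the chelating group of oligo A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    """A nucleic acid species present after the LCR (idealized model)."""
    name: str
    has_chelator: bool      # carries the 5'-chelating group of marker oligo A
    binds_capture_d: bool   # contains the oligo B sequence complementary to bound oligo D


def gives_signal(species: Species) -> bool:
    # Retention requires hybridization to membrane-bound oligo D;
    # enzyme coupling requires the chelating group. Both are needed.
    return species.has_chelator and species.binds_capture_d


SPECIES = [
    Species("oligo A (unligated)", has_chelator=True, binds_capture_d=False),
    Species("oligo B (unligated)", has_chelator=False, binds_capture_d=True),
    Species("marker strand AB", has_chelator=True, binds_capture_d=True),
    Species("strand CD", has_chelator=False, binds_capture_d=False),
]

signals = {s.name: gives_signal(s) for s in SPECIES}

for name, positive in signals.items():
    print(f"{name}: {'signal' if positive else 'no signal'}")
```

Under these assumptions only the ligated marker strand AB is reported as positive, which mirrors the statement that unligated oligos and strand CD cannot give rise to a signal.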

Example 4 Sequence Specific Detection of PCR Fragments 

PCR is carried out in crude cell lysates with a derivatized oligonucleotide 
primer (Figure 14). After denaturing, the PCR reaction is filtered through a membrane
derivatized with a capture oligo. The capture oligo can contain any sequence that is complementary to
the expected PCR fragment and hybridizes with the strand elongated from the derivatized oligo.
Although any nucleic acid containing the sequence complementary to the capture oligo 
will be retained on the membrane, only PCR products containing the derivatized 
oligonucleotide primer can bind the modified BAP-his 6 enzyme. The PCR product is 
detected by BAP activity retained on the membrane after an adequate washing procedure.
This setup allows PCR with crude lysates, since contaminating agents can be removed by 
filtration and only the PCR products retained by hybridization to the membrane bound 
oligonucleotide give rise to a detectable signal. This setup is also amenable to 
multiplexing (see above). 

Those skilled in the art will recognize or be able to ascertain using no 
more than routine experimentation, numerous equivalents to the specific procedures 
described herein. Such equivalents are considered to be within the scope of the invention 
and are covered by the following claims. 

We claim,

1. A composition comprised of at least two biopolymers conjugated 
to an insoluble support by at least one reversible linkage. 

2. A composition according to claim 1, wherein the at least two 
biopolymers are comprised of nucleic acids. 

3. A composition according to claim 1, wherein the at least two 
biopolymers are comprised of polypeptides. 

4. A composition according to claim 1, wherein the at least two 
biopolymers are comprised of a nucleic acid and a protein. 

5. A composition according to claim 1, wherein the at least one 
reversible linkage is formed through a trityl derivative, a chelate complex, a hydrophobic 
interaction or a photocleavable functionality. 

6. A composition according to claim 1, wherein the insoluble support 
is selected from the group consisting of: a flat surface, a comb and a bead. 

7. A composition according to claim 6, wherein the insoluble support 
is selected from the group consisting of: a silicon wafer, glass plate, metal, plastic, film 
and composites thereof with pits or wells. 

8. A composition according to claim 7, wherein the biopolymer is 
conjugated to the insoluble support in an array format. 

9. A composition according to claim 7, wherein the bead is 
comprised of an inorganic material selected from the group consisting of: silica, 
Controlled Pore Glass (CPG), plastic, metal, cellulose, Sepharose and Sephadex. 

10. A composition according to claim 6, wherein the insoluble support
is comprised of a magnetic or electromagnetic material. 

11. A composition according to claim 2, wherein the nucleic acid is 
selected from the group consisting of: deoxyribonucleic acid (DNA), ribonucleic acid 
(RNA) or analogs or mimetics of DNA or RNA. 

12. A composition according to claim 3, wherein the polypeptide is
selected from the group consisting of an antibody, enzyme, receptor or peptide. 

13. A composition according to claim 1, which contains a spacer 
between the biopolymer and the insoluble support. 

14. A composition according to claim 4, which is made by the
formation of a chelate complex between the nucleic acid and the polypeptide. 

15. A composition according to claim 14, wherein the chelate complex 
is formed by the reaction of a nucleic acid containing a chelate functionality with a 
polypeptide containing an imidazoyl functionality in the presence of a metal ion. 

16. A composition of claim 14, wherein the chelate complex is formed 
by the reaction of a nucleic acid containing an imidazoyl functionality with a polypeptide 
containing a chelate functionality in the presence of a metal ion. 

17. A composition according to claim 15 or 16, wherein the
polypeptide is an enzyme. 

18. A composition according to claim 17, wherein the enzyme is an 
alkaline phosphatase. 



19. A method according to claim 18, wherein the enzyme is bacterial
alkaline phosphatase (BAP).

20. A method for making a composition of claim 1, comprising the
steps of:

a) immobilizing a nucleic acid to an insoluble support via a 
first reversible linkage; and 

b) conjugating said nucleic acid with a polypeptide via a 
second reversible linkage. 

21. A method according to claim 20, wherein the first or second
reversible linkage is formed through a trityl derivative, a chelate complex, a hydrophobic 
interaction or a photocleavable functionality. 

22. A method according to claim 20, wherein step b), the first or 
second reversible linkage forms a chelate complex. 

23. A method according to claim 22, wherein the first or second
reversible linkage is formed by the reaction of a nucleic acid containing a chelate 
functionality with a polypeptide containing an imidazoyl functionality in the presence of a 
metal ion. 

24. A method according to claim 22, wherein the first or second 
reversible linkage is formed by the reaction of a nucleic acid containing an imidazoyl 
functionality with a polypeptide containing a chelate functionality in the presence of a 
metal ion. 

25. A method according to claim 20, wherein the first or second
reversible linkage is formed from functionalities or precursors, which are introduced
into the nucleic acid during enzymatic synthesis. 

26. A method according to claim 25, wherein the enzymatic synthesis 

27. A method of claim 26, wherein the amplification procedure is
selected from the group consisting of the polymerase chain reaction (PCR), the ligase 
chain reaction (LCR) and strand displacement amplification (SDA).

28. A method according to claim 25, wherein the enzymatic synthesis 
is part of a nucleic acid sequencing procedure. 

29. An oligonucleotide analog comprised of a β-cyanoethylphosphoamidite
functionality with a chelate functionality.

30. An oligonucleotide analog of claim 29, wherein the chelate 
functionality is a precursor of nitrilotriacetic acid derived from either serine, cysteine or 
lysine. 

31. An oligonucleotide analog comprised of a heterobifunctional trityl 
group with a chelate functionality. 

32. An oligonucleotide analog of claim 31, wherein the chelate
functionality is a precursor of nitrilotriacetic acid derived from serine, cysteine or lysine.

33. An oligonucleotide analog comprised of a β-cyanoethylphosphoamidite
functionality with an imidazolyl functionality.

34. An oligonucleotide analog comprised of a heterobifunctional trityl 
group with a oligohistidyl or oligoimidazolyl sequence. 

35. An oligonucleotide analog according to claim 34, wherein the 
oligohistidyl sequence is present at the 5'- or 3'-terminus.

36. An oligonucleotide analog comprised of an
imidazolylnucleoside-β-cyanoethylphosphoamidite.

37. A member selected from the group consisting of: nucleoside
triphosphates, 2'-deoxynucleoside triphosphates, 3'-deoxynucleoside triphosphates and
2',3'-dideoxynucleoside triphosphates, wherein the member contains a chelate
functionality at either C5 in the pyrimidine ring of thymine, uracil, or cytidine or at C8 in
the purine ring of adenine, guanine or hypoxanthine.

38. A member selected from the group consisting of: nucleoside
triphosphates, 2'-deoxynucleoside triphosphates, 3'-deoxynucleoside triphosphates and
2',3'-dideoxynucleoside triphosphates, wherein the member contains an oligohistidyl or
oligoimidazolyl chain at either C5 in the pyrimidine ring of thymine, uracil, or cytidine or
at C8 in the purine ring of adenine, guanine or hypoxanthine.

39. A recombinant protein which carries at its C-terminus an 
oligopeptide chain, which is capable of forming a chelate complex in the presence of 
metal ions. 

40. A recombinant protein according to claim 39 which has enzymatic
activity.

41. A recombinant according to claim 40, which is an alkaline 
phosphatase, which has an alanine residue at its N-terminus instead of arginine-threonine 
and which has at its C-terminus a chain of six histidine residues. 



42. A peptide which carries at its N- or C- terminus an oligohistidyl 
sequence, which is capable of forming a chelate complex in the presence of metal ions. 

43. A peptide which carries at its N- or C- terminus a chelator
functionality which is capable of forming a chelate complex in the presence of metal ions.

44. A composition of claim 1, wherein the insoluble support is linked
via a spacer to the nucleic acid through a reversible heterobifunctional trityl group and
the nucleic acid is conjugated to an enzyme through a reversible chelate functionality. 

45. A composition according to claim 44 in which the polymer
support is comprised of magnetic beads, the chelate complex is formed via the
nitrilotriacetic acid functionality in the presence of Ni2+ and the enzyme is BAP-his 6.

46. A composition according to claim 44 in which the polymer support 
is a silicon wafer carrying the reversible functionalities to bind the nucleic acid either 
directly on the surface or through beads in pits or wells in an array format, the chelate 
complex is formed via nitrilotriacetic acid functionality in the presence of Ni2+ and the
enzyme is BAP-his 6.

47. A composition according to claim 44 in which the polymer support 
is the filter bottom in the wells of a microtiter filter plate, the chelate complex is formed 
via nitrilotriacetic acid functionality in the presence of Ni2+ and the enzyme is BAP-his 6.

48. A method of using the composition according to claim 44 to purify 
and to detect products of nucleic acid amplification procedures. 

49. A method of claim 48, wherein the amplification procedure is 
selected from the group consisting of: the polymerase chain reaction, the ligase chain 
reaction and strand displacement amplification. 

50. A method for using the composition according to claim 44 for
determining the sequence of a nucleic acid. 



51. A method for using the composition according to claim 44 to
purify and to detect the identity and relative quantity of mRNAs or their corresponding
cDNAs for genetic or expression profiling.

52. A method for using the composition according to claim 44 to 
purify and to detect products of nucleic acid amplification procedures. 

1. Claims: 2,11 completely; 1,5-10,13 partially

Composition comprising at least two biopolymers conjugated
to an insoluble support by at least one reversible linkage,
wherein the at least two biopolymers are comprised of
nucleic acids.

2. Claims: 3,12 completely; 1,5-10,13 partially

Composition comprising at least two biopolymers conjugated
to an insoluble support by at least one reversible linkage,
wherein the at least two biopolymers are comprised of
polypeptides.

3. Claims: 4,14-28 completely; 1,5-10,13 partially

Composition comprising at least two biopolymers conjugated
to an insoluble support by at least one reversible linkage,
wherein the at least two biopolymers are comprised of a
nucleic acid and a protein.

4. Claims: 1,5-10,13 partially

Composition comprising at least two biopolymers conjugated
to an insoluble support by at least one reversible linkage,
wherein the at least two biopolymers are not comprised of
two nucleic acids, two polypeptides, or a nucleic acid and a
protein.

5. Claims: 29,30,33,36

Oligonucleotides comprised of a β-cyanoethylphosphoramidite.

6. Claims: 31,32,34,35

Oligonucleotides comprised of a heterobifunctional trityl
group.

7. Claims: 37,38

Nucleoside triphosphates with a chelate functionality on the
base moiety.

8. Claims: 39-41

Recombinant proteins with enzymatic activity.

9. Claims: 42,43

A peptide with a chelator at the N or C terminus.

arbitrary oligonucleotides. PCR amplification products were labeled using α-32P-dCTP and were
visualized by autoradiography after electrophoresis on denaturing polyacrylamide gels. A number 
of bands appeared to be differentially expressed, and were cloned as described above. 

UC Band #321 was confirmed by RT-PCR to be down regulated in the peripheral blood of 
prostate cancer patients, with a four-fold decrease observed compared with normal individuals. 
The DNA sequence of Band #321 does not match any known sequences in the GenBank database. 
It therefore represents a previously undescribed gene product. 

UC Band #302 and UC Band #325 were both observed to be up regulated in the peripheral 
blood of metastatic prostate cancer patients. UC Band #302 is identical in sequence to a portion of 
the sequence of elongation factor 1-α (GenBank Accession #X03558). This band was modestly
increased, between 1.6- and 2-fold, in metastatic cancer patients compared with normal individuals.

UC Band #325 was found to consist of two different alternatively spliced forms of mRNA
encoded by the interleukin-8 (IL-8) gene. UC Band #325-1, the previously identified mRNA
species of IL-8 (GenBank Accession #Y00787), is approximately seven-fold more abundant in the
peripheral blood of metastatic prostate cancer patients. The alternatively spliced IL-8 mRNA,
containing intron #3 of the IL-8 gene (GenBank Accession #M28130), is up to seven-fold less
abundant in the peripheral blood of metastatic prostate cancer patients. Fig. 1A shows relative
quantitative RT-PCR of the differential expression of IL-8 (=UC325) in peripheral blood of
patients with metastatic prostate cancer (M) and normal individuals (N) at different PCR cycles
(cy). The two alternatively spliced forms of the IL-8 mRNA are observed. The upper band (int.+)
includes intron 3 in the mature mRNA. The lower band (int.-) lacks intron 3. Fig. 1B shows
relative quantitative RT-PCR of the differential expression of IL-8 (UC325) in peripheral blood
of patients with metastatic prostate cancer in lanes 1-5 and a pool of normal individuals (N). The
alternatively spliced forms of the IL-8 mRNA observed differ between normal individuals
and those with prostate cancer. Overall, there is an approximately 30-fold change in the ratios of
the two spliced forms of IL-8 mRNA in individuals with metastatic prostate cancer compared with
normal individuals. These results have been confirmed by relative quantitative RT-PCR.

As described above, an increased expression of IL-8 mRNA has been previously reported in 
cancer patients. However, this represents the first finding of an alternatively spliced form of IL-8 
mRNA, containing intron 3, that is significantly more abundant in normal individuals compared 
with metastatic prostate cancer patients. These results are surprising in view of previous reports
which had failed to find any alternatively spliced forms of IL-8 mRNA in normal individuals or 
cancer patients. 

It will be recognized that the genes and gene products (RNAs and proteins) for the above
described markers of metastatic prostate cancer are included within the scope of the disclosure
herein described. It will also be recognized that the diagnosis and prognosis of metastatic prostatic 
cancer by detection of the nucleic acid products of these genes are included within the scope of the 
present invention. Serological and other assays to detect these mRNA species or their translation 
products are also indicated. It is obvious that these assays are of utility in diagnosing metastatic 
cancers derived from prostate and other tissues. 

Most significantly, these Examples demonstrate the feasibility of using RNA fingerprinting 
to identify mRNA species that are differentially expressed in the peripheral blood of patients with 
asymptomatic diseases or in patients with symptoms that are insufficient for a definitive 
diagnosis. It will be appreciated that this technique is applicable not only to the detection and 
diagnosis of prostate and other cancers, but also to any other disease states which produce 
significant effects on lymphocyte gene expression. Uses which are contemplated within the scope 
of the present disclosure include the detection and diagnosis of clinically significant diseases that
require medical intervention, including but not limited to asthma, lupus erythematosus,
rheumatoid arthritis, multiple sclerosis, myasthenia gravis, autoimmune thyroiditis, ALS, interstitial 
cystitis and prostatitis. 

TABLE 2

Genes Whose mRNAs have Abundances that Vary in
Metastatic Prostate Cancer Relative to Normal Individuals

Name of cDNA Fragment     Sequence Determined   Confirmed by RT-PCR   Previously Known
UCPB 35                   Yes                   Yes                   GB #T03013
UC 302 SEQ ID NO:3        Yes                   Yes                   EF1-α
UC 321 SEQ ID NO:2        Yes                   Yes                   No
UC 325-1 SEQ ID NO:4      Yes                   Yes                   GB #Y00787
UC 325-2 SEQ ID NO:5      Yes                   Yes                   IL-8

TABLE 3.

Oligonucleotides used in the relative quantitative RT-PCR portion of these studies.

Oligonucleotides used to examine the expression of genes:

UCPB Band #35 (previously uncharacterized gene).

5' TGCAAACTTTCACCTGGACTT 3', SEQ ID NO:10
5' CTTGTGACTTGCTTTGATAGAATG 3', SEQ ID NO:11

UC Band #302 (elongation factor 1-α).

5' GACAACATGCTGGAGCCAAGTGC 3', SEQ ID NO:12
5' ACCACCAATTTTGTAAGAACATCCT 3', SEQ ID NO:13

UC Band #321 (previously uncharacterized gene).

5' TGTCCAGAGATCCAAGTGCAGAAGG 3', SEQ ID NO:14
5' GAGCTCCAGGAGACAGAAGCCATAG 3', SEQ ID NO:15

UC Band #325-1 (IL-8).

5' GGGCCCCAAGGAAAACT 3', SEQ ID NO:16
5' TGGCAACCCTACAACAGACC 3', SEQ ID NO:17

UC Band #325-2 (IL-8).

5' GGGCCCCAAGGAAAACT 3', SEQ ID NO:18
5' TGGCAACCCTACAACAGACC 3', SEQ ID NO:19

Controls used to normalize relative quantitative RT-PCR:

β-actin

5' CGAGCTGCCTGACGGCCAGGTCATC 3', SEQ ID NO:8
5' GAAGCATTTGCGGTGGACGATGGAG 3', SEQ ID NO:9

Asparagine Synthetase (AS)

5' ACATTGAAGCACTCCGCGAC 3', SEQ ID NO:20
5' AGAGTGGCAGCAACCAAGCT 3', SEQ ID NO:21

Example 4:

DNA Sequences of Markers of Metastatic Prostate Cancer 



The DNA sequences of the markers of metastatic prostate cancer were determined by 
Sanger dideoxy sequencing as detailed above. The identified sequences are provided in Table 4. 



TABLE 4. 

DNA Sequences of Markers of Metastatic Prostate Cancer: 



UCPB Band #35 (SEQ ID NO: 1) Matches a fetal brain EST, GenBank Accession # T03013 

5'GGCAGGGGCTTGTGACTCTAAGATGGCTTCATTCACATGCCTAGGGCCTCAGTAGG 
ATGACTGGCATGGCCCTGGAAAACTGCGAAGTCTTCTCTCTGTGCAAACTTTCACCT 
GGACTTTTTATATGATTCTGGAAGTATTCCAAGAAGGCAAAAGTAAAAACTGCAAA 
GCGTCTTAAAATAGAAGTTCAGAAGCCACATTATATCACTTCTGTTGCATTCTATCA 
AAGCAAGTCACAAGCCCCTGCCAATCA 3' 



UC Band #321 (SEQ ID NO:2) Previously Uncharacterized Gene

5'CACACACTCCCCCATTCTGAGCCCCAAGAGGCTCATCCCTAAGGATGTCCAGAGA
TCCAAGTGCAGAAGGAGAATGTGGTGAGGCTATTTATTCCCCCAGTGCCTTCCCTGC 
TGGGCTATGGATGAACAGTGGCTGACTTCATCTAGGAAAGAGCTATGGCTTCTGTCT 
CCTGGAGCTCACCA 3' 



UC Band # 302 (SEQ ID NO:3) Human Elongation Factor 1-alpha, GenBank Accession 
#X03558 

5'GGTGAGCCCCAGGAGACAGAAGAGATATGAGGAAATTGTTAAGGAAGTCAGCAC
TTACATTAAGAAAATTGGCTACAACCCCGACACAGTAGCATTTGTGCCAATTTCTGG
TTGGAATGGTGACAACATGCTGGAGCCAAGTGCTAACATGCCTTGGTTCAAGGGAT
GGAAAGTCACCCGTAAGGATGGCAATGCCAGTGGAACCACGCTGCTTGAGGCTCTG
GACTGCATCCTACCACCAACTCGTCCAACTGACAAGCCCTTGCGCCTGCCTCTCCAA
GGATGTTCTTACAAAATTGGTGGTATTGGTACTGTTCCCTGTTTGGCCGAATTGGAA
AACTGGTGTTCCTCCAAACCCCGGTTATGGTGGGTTTCCTCCTCCTTGGA 3'



UC Band #325-1 (SEQ ID NO:4) Human IL-8 mRNA, GenBank Accession #Y00787

5'GGGCGGAACAAGGGAGCGCTAAAAGGAAATTAGGATGTCAGGTGCATAAAGGAC 
ATAATTCCAAAACCTTTCCAAACCCCAAATTTATTCAAAGGAACTGAGGAGTGGATT 
GAGGAGTGGACCAACACTGGCGCCAAACACAGAAATTATTGTAAAGCTTTCTGATG 
GAAGAGAGCTCTGTCTGGGCCCCAAGGAAAACTGGGTGCAGAGGGTTGTGGAGAAG
TTTTTGAAGAGGGCTGAGAATTCATAAAAAAATTCATTCTCTGTGGTATCCAAGAAT
CAGTGAAGATGCCAGTGAAACTTCAAGCAAATCTACTTCAACACTTCATGTATTGTG
TGGGTCTGTTGTAGGGTTGCCAGTTGTT 3'



UC Band #325-2 (SEQ ID NO:5) Human IL-8 mRNA Containing Intron #3 

5'GCTTGGGCCCCAAGGAAAACTGGGTGCAGAGGGTTGTGGAGAAGTTTTTGAAGAG 
GTAACniATATATTTTTGAATTTAAAATTTGTCATTTATCCGTGAGACATATAATCCA- 
AA(iTC'A(iCCl'ATAAATTTCTTTCTGTTGCTAAAAATCGTCATTAGGTATCTGCCTTTT 
TGGTI AAAAAAAAAAGGAATAGCATCAATAGTGAGTGTGTTGTACTCATGACCAGA 
AAGACCA ! AC ATAGTTTGCCCAGGAAATTCTGGGTTTAAGCTTGTGTCCTATACTCTT 
ACTA A ACT 1 CTTTGTCACTCCCAGTAGTGTCCTATGTTAGATGATAATGTCTTTGATC 
TCCC I AT ri ATAGTTGAGAATATAGAGCATGTCTAACACATGAATGTCAAAGACTAT 
ATI GAG ITnCAAGAACCCTACTTTCCTTCTTATTAAACATAGCTCATCTTTATATTGT 
GAA TIT I AlTTTAGGGCTGAGAATTCATAAAAAAATTCATTCTCTGTGGTATCCAAG 

AATCAGTGAAGATGCCAGTGAAACTTCAAGCAAATCTACTTCAACACTTCATGTATT
GTGTGGGTCTGTTGTAGGGTTGCCA 3'
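
As a consistency check between the primer list of Table 3 and the sequences of Table 4, the following Python sketch locates the UCPB Band #35 primer pair (SEQ ID NO:10 and SEQ ID NO:11) within SEQ ID NO:1 as transcribed above. It is an illustration only: the helper names `reverse_complement` and `find_amplicon` are hypothetical, exact matching is assumed (no mismatches tolerated), and any transcription error in the sequence text would cause the lookup to fail.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# SEQ ID NO:1 (UCPB Band #35), transcribed from Table 4
SEQ_ID_NO_1 = (
    "GGCAGGGGCTTGTGACTCTAAGATGGCTTCATTCACATGCCTAGGGCCTCAGTAGG"
    "ATGACTGGCATGGCCCTGGAAAACTGCGAAGTCTTCTCTCTGTGCAAACTTTCACCT"
    "GGACTTTTTATATGATTCTGGAAGTATTCCAAGAAGGCAAAAGTAAAAACTGCAAA"
    "GCGTCTTAAAATAGAAGTTCAGAAGCCACATTATATCACTTCTGTTGCATTCTATCA"
    "AAGCAAGTCACAAGCCCCTGCCAATCA"
)

FORWARD_PRIMER = "TGCAAACTTTCACCTGGACTT"      # SEQ ID NO:10
REVERSE_PRIMER = "CTTGTGACTTGCTTTGATAGAATG"   # SEQ ID NO:11


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the region bounded by the two primers, or None if either is absent."""
    start = template.find(forward)
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


product = find_amplicon(SEQ_ID_NO_1, FORWARD_PRIMER, REVERSE_PRIMER)
if product is None:
    print("primer pair not found in template")
else:
    print(f"expected RT-PCR product: {len(product)} nt")
```

The same check can be applied to the other primer pairs of Table 3 once the corresponding sequences are free of transcription damage.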



Example 5: 

Detection and Differential Diagnosis of BPH versus Localized and 
Advanced Stage Prostate Carcinomas Using 
Combinations of IL-8 with Other Prostate Disease Markers.

A total of 164 serum specimens from normal men or men with a biopsy confirmed 
diagnosis of BPH or prostate cancer were studied. These serum specimens were provided by Dr. 
George Wright from the Virginia Prostate Center at the Eastern Virginia Medical School and by
Dr. Robert Vessella from the University of Washington, or were from normal donors from UroCor,
Inc. All patients were biopsy-confirmed for either BPH or prostate carcinoma (stages A, B, and 
C only) within six months after PSA serum collection and/or a DRE-positive diagnosis. All 
patient sera were obtained prior to any surgical or hormonal therapies. The mean age of the total 
sample was 69.4 ± 8.6 years (range = 37-91 years) old. 

The subset of patients utilized for the multivariate diagnostic serum model consisted of 13
BPH and 64 CaP (Stages A, B, C) cases from the parent population (Marley et al., 1996). All
patients in the subset had a total PSA between 2.0 - 20.0 ng/ml, which is a standard range for f/t
PSA testing (Marley et al., 1996). Also evaluated was a subset of Stage D CaP patients, with
t-PSA values ranging from 6.5 - 867 ng/ml.



Diagnosis      N    Mean Age ± Std. Dev. (Range)
Normal         8    < 50 years
BPH            55   66.4 ± 8.6 (37 - 87) years
CaP Stage A    24   74.7 ± 7.8 (61 - 91) years
CaP Stage B    48   68.3 ± 7.9 (51 - 85) years
CaP Stage C    14   68.9 ± 6.9 (60 - 80) years
CaP Stage D    14   72.3 ± 8.6 (58 - 86) years

Table 5 shows the distribution of the total PSA levels, the f/t PSA ratios, and the UC325 
levels for the 164 patients, broken down by normals, BPH, and Stages A, B, C, & D prostate 
cancer. Only the BPH, Stage A, Stage B, and Stage C prostate cancer patients were included in 
the statistical analysis. 



TABLE 5

UC325 Patient Sample Characteristics (n = 164)
Mean Value ± Std. Dev.

Diagnosis      N    UC325 (pg/ml)   Total PSA (ng/ml)   f/t PSA Ratio (%)
Normal         8    0.2 ± 0.6       N/A                 N/A
BPH            55   6.8 ± 6.1       6.9 ± 4.0           21.9 ± 10.9%
CaP Stage A    24   19.1 ± 10.4     6.2 ± 2.7           14.6 ± 10.5%
CaP Stage B    48   13.5 ± 9.5      8.8 ± 6.6           11.9 ± 5.7%
CaP Stage C    15   19.1 ± 7.9      16.2 ± 7.6          11.2 ± 8.3%
CaP Stage D    14   78.9 ± 197      244 ± 332           12.4 ± 7.1%



Table 6 illustrates the ability of the f/t PSA ratio at three different cutoffs to differentiate
prostate cancer and BPH in the inventors' patient sample. UC325 (IL-8) and t-PSA are analyzed
at single Classification and Regression Tree (CART) cutoff points for the same outcome. Note
the significant improvement in both sensitivity and specificity contributed by the UC325 (IL-8)
serum assay in detecting clinically organ-confined disease. The combination of UC325 (IL-8), treated as a
continuous variable, and t-PSA or f/t PSA ratio provides a highly predictive multivariate test
system to diagnose CaP (clinical stages A & B) without any interference from BPH in the
inventors' patient subset.

TABLE 6

Ability of Serum Tests to Discriminate BPH and CaP.

Serum Test           Cutoff*       Sensitivity   Specificity   AUC      p-value
f/t PSA Ratio        11%           52.9%         91.9%         0.7905   < 0.0001
  "                  14%           70.1%         80.0%           "        "
  "                  20%           85.1%         47.3%           "        "
UC325                9.8 pg/ml     72.4%         74.5%         0.7973   < 0.0001
Total PSA            14.8 ng/ml    17.2%         98.2%         0.5995   0.0134
f/t PSA & UC325      0.69**        71.3%         90.9%         0.8784   < 0.0001
Total PSA & UC325    0.64**        62.1%         85.5%         0.8069   < 0.0001

*All cutoffs determined using Classification and Regression Tree Analysis (CART)
**Predicted probability value calculated using logistic regression function



To further substantiate the results of Table 6, individual analyses using Receiver Operator
Characteristic (ROC) curves are provided for each variable. Figure 2 illustrates the ability of
t-PSA to distinguish BPH and Stages A, B, and C prostate cancer. Figure 3 shows the ability of f/t
PSA ratio to distinguish BPH and Stages A, B, and C prostate cancer. Figure 4 shows the ability
of UC325 (IL-8) alone to distinguish BPH and Stages A, B, and C prostate cancer. Figure 5
shows the ability of the combination of UC325 (IL-8) and total PSA (t-PSA) to distinguish BPH
and Stages A, B and C prostate cancer. Figure 6 shows the ability of the combination of UC325
(IL-8) and the f/t PSA ratio to distinguish between BPH and Stages A, B and C prostate cancer.
It is apparent that the combination of UC325 measurement with either t-PSA or f/t PSA
provides a significant increase in sensitivity of detection, while maintaining a high degree of
specificity. Thus, the combination of UC325 (IL-8) with other prostate disease markers, such as
t-PSA or f/t PSA ratio, provides a significant advance in the detection and differential diagnosis
of prostate cancer.

Table 7 presents the correlation values for the different serum markers. This table clearly
shows that the UC325 biomarker provides information which is independent of that provided by 
the f/t PSA ratio. 



TABLE 7

Correlation Values for BPH vs Stages A, B & C (n = 142)

                     Diagnosis   Total PSA   f/t PSA     UC325     Age      Clinical
                                 (ng/ml)     Ratio (%)   (pg/ml)            Stage
Diagnosis            1.0000      0.5647      -0.1912     0.2262    0.1590   0.3497
Total PSA (ng/ml)    0.5647      1.000       -0.2319     0.5991    0.0898   0.3729
f/t PSA Ratio (%)    -0.1912     0.2319      1.0000      -0.2142   0.0641   -0.4126
UC325 (pg/ml)        0.2262      0.5991      0.2142      1.0000    0.0881   0.2486
Age                  0.1590      0.0898      0.0641      0.0881    1.0000   0.1372
Clinical Stage       0.3497      0.3729      -0.4126     0.2486    0.1372   1.0000



Table 8 clearly demonstrates a relationship between tumor burden and serum UC-325
gene product measured by IL-8 assay. Note that as the biopsy-confirmed clinical stage of the cancer
increases, so does the IL-8 serum marker concentration, whereas the same relationship did not
occur with [t-PSA] or f/t PSA ratio.

TABLE 8

UC325 Culled Dataset, One High and Low Value Removed (n=164)

Specimen             UC325 (10 pg/ml Cutoff)    UC325 (15 pg/ml Cutoff)
Stage          N     Negative     Positive      Negative     Positive
Normal         8     8 (100%)     0 (0%)        8 (100%)     0 (0%)
BPH            55    41 (75%)     14 (25%)      50 (91%)     5 (9%)
Stage A & B    72    25 (35%)     47 (65%)      43 (60%)     29 (40%)
Stage C        15    0 (0%)       15 (100%)     5 (33%)      10 (67%)
Stage D        14    2 (14%)      12 (86%)      3 (21%)      11 (79%)
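
The counts of Table 8 can be converted into sensitivity and specificity figures. The following Python sketch is an illustration under stated assumptions and is not the inventors' statistical procedure: BPH is treated as the non-cancer reference group, Stages A, B and C are pooled as the cancer group (normals and Stage D are left out, as in the statistical analysis described above), and the function name `sensitivity_specificity` is hypothetical.

```python
# (negative, positive) counts taken from Table 8 for each UC325 cutoff
TABLE_8 = {
    "10 pg/ml": {"BPH": (41, 14), "Stage A & B": (25, 47), "Stage C": (0, 15)},
    "15 pg/ml": {"BPH": (50, 5), "Stage A & B": (43, 29), "Stage C": (5, 10)},
}

CANCER_GROUPS = ("Stage A & B", "Stage C")   # assumption: pooled cancer group
REFERENCE_GROUP = "BPH"                      # assumption: non-cancer reference


def sensitivity_specificity(counts: dict) -> tuple:
    """Return (sensitivity, specificity) for one cutoff."""
    true_pos = sum(counts[g][1] for g in CANCER_GROUPS)
    false_neg = sum(counts[g][0] for g in CANCER_GROUPS)
    true_neg, false_pos = counts[REFERENCE_GROUP]
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


results = {cutoff: sensitivity_specificity(c) for cutoff, c in TABLE_8.items()}

for cutoff, (sens, spec) in results.items():
    print(f"UC325 cutoff {cutoff}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Raising the cutoff trades sensitivity for specificity, which is the usual behavior of a single-threshold test and the reason ROC curves are used to compare markers across all cutoffs.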

Example 6:

Identification of Markers of Metastatic Prostate and Breast Cancer by Use of
RNA fingerprinting by PCR primed with oligonucleotides of arbitrary sequence.

RNA fingerprinting displays PCR™ amplified cDNA fragments that represent a sample 
of RNA species derived from a population of total cell RNAs. When displayed side by side, 
comparisons of similarly produced fingerprints representing RNA populations from cells of
differing physiologic states identify mRNA species whose relative abundances vary between
the examined physiologic states. In this study, RNA fingerprinting identified two cDNA 
fragments derived from mRNA species that had higher steady state abundances in the peripheral 
blood leukocytes of patients with recurrent metastatic prostate cancer as compared to a group of 
healthy volunteers. 

Eight ml of peripheral blood was collected from healthy volunteers, patients with 
clinically and biopsy confirmed BPH, localized and advanced metastatic prostate cancer, and 
from patients with advanced metastatic breast cancer. Metastatic prostate and breast cancer 
patients that had failed a primary therapy and had evidence of recurrence of disease were 
selected. The metastatic prostate cancer patients had high (> 50 ng/ml) serum concentrations of 
PSA. Circulating nucleated peripheral blood cells were separated from erythrocytes by
centrifugation in Vacutainer® CPT™ tubes (Becton Dickinson and Company, Franklin Lakes,
NJ). Total RNA was prepared from isolated nucleated peripheral blood cells by lysis with RNA
Stat-60™ (Tel-Test, Inc., Friendswood, TX) following the instructions provided by the vendor.
Contaminating genomic DNA was removed from the total RNAs by digestion with RNase-free
DNase I (GIBCO-BRL, Gaithersburg, MD). For the PCR™ based applications of RNA
fingerprinting and relative quantitative RT-PCR™, it is absolutely critical that the total RNA is
completely free of genomic DNA. Typically, 5.0 to 10.0 µg of total RNA was digested with
20-40 units of RNase-free DNase I in 100-200 µl of reaction volume for 20 min at 37°C.

Following digestion, the total RNAs were extracted with phenol (pH=4.3, Amresco, Inc.,
Solon, OH) and ethanol precipitated. To confirm that the RNA was free of contaminating
genomic DNA, 500 ng to 1.0 µg of each DNase I treated RNA was resuspended in water. These
were used as templates for PCR™ using oligonucleotide primers that anneal to exons 3 and 4 of
the gene encoding PSA (exon 3: 5' GCCTCAGGCTGGGGCAGCATT 3' SEQ ID NO:22; exon
4: 5' GGTCACCTTCTGAGGGTGAACTTGC 3' SEQ ID NO:23). These primers anneal to
opposite strands of genomic DNA that flank the 145 bp intron 3 of the PSA gene. PCR™ was
performed at 94°C for 1:15 min, followed by 40 cycles of 94°C for 45 sec, 55°C for 45 sec, and
72°C for 1:15 min, then a final extension of 72°C for 5:00 min. RNA was considered DNA-free
if no PCR™ products could be visualized upon gel electrophoresis that co-migrated with the PSA
gene positive control of known human genomic DNA. If PSA gene products were observed after
PCR™, the RNA was redigested with DNase I and analyzed again for contaminating genomic
DNA. After it was confirmed that the RNAs were free of genomic DNA, 500 ng to 1.0 µg of
RNA was electrophoresed on a 1.2% agarose Tris Acetate EDTA (TAE) gel to visualize the
ribosomal RNAs (Fridell et al., 1995). Only RNA preparations for which the 28S ribosomal
RNA could be visualized were selected for further analysis by RNA fingerprinting and relative
quantitative RT-PCR™.

RNA fingerprinting with arbitrarily chosen oligonucleotide primers (Welsh et al., 1992)
is conceptually similar to differential display (Liang and Pardee, 1992), except that
oligonucleotides of arbitrary sequence are used to prime both strands of cDNA synthesis instead
of just second strand synthesis, as in differential display. In this investigation, the strategy of
RNA fingerprinting used was similar to that described in Ralph et al. (1993) except that
the oligonucleotide primers used were composed of two discrete domains. The 5' domain of these
oligonucleotides consisted of ten nucleotides that complemented sequences from either the T7
promoter or the M13 reverse sequencing primer. The 3' domains of these oligonucleotides were
8-mer sequences predicted to anneal frequently to the protein-coding regions of mRNAs in a
promiscuous fashion (Lopez-Nieto and Nigam, 1996). These oligonucleotides were then used in
a sequential pairwise strategy that optimizes the amount of mRNA complexity that can be
surveyed with limited numbers of primers and starting RNA. Care was taken to ensure that the
two oligonucleotides used to produce any single fingerprint did not share sequence similarity in
either their 5' or 3' domains. Because these oligonucleotides were constructed of short sequence
domains that have specific functions within this experimental design, the oligonucleotides are
promiscuous rather than truly arbitrary in nature.

Two RNA pools were fingerprinted. These two pools were each created by combining
equal amounts of peripheral blood total RNA from five individuals. One pool was constructed
by pooling RNA from five healthy individuals while the other pool was derived from five
individuals with recurring metastatic prostate cancer. Using the pooled RNAs as templates, first
strand cDNA synthesis was primed by annealing one of the promiscuous oligonucleotide primers
to the pooled RNAs at low stringency. All fingerprinting studies were performed in duplicate
using different initial concentrations of template RNA. The replicate fingerprints were initiated
by using either 125 ng or 250 ng of RNA as template during first strand cDNA synthesis.
Reaction conditions for first strand cDNA synthesis were 250 units of Superscript II™
(GIBCO-BRL, Gaithersburg, MD) in 1X supplier's reaction buffer (25 mM Tris-HCl [pH=8.3],
37.5 mM KCl, 3.0 mM MgCl2), 10 mM DTT, 400 µM each dNTP, and 2.0 µM promiscuous
oligonucleotide in a 40 µl volume. The reaction was incubated for 1 h at 37°C. Following first
strand cDNA synthesis, the RNA was digested with RNase H and heat inactivated at 70°C as
directed by the supplier.

One-tenth (4.0 µl) of the first strand cDNA reaction mixture was used in the
fingerprinting PCR™ reaction. As many as ten different RNA fingerprints were generated from
each first strand cDNA reaction. To the first strand cDNA, 36 µl of a PCR™ mix solution was
added. The latter contained 50 mM Tris-Cl (pH=8.3), 50 mM KCl, 200 µM each dNTP, 1.0 µCi
of α-33P-dCTP, 2.0 µM second promiscuous oligonucleotide and 1.0 unit of recombinant Taq
DNA polymerase (GIBCO-BRL, Gaithersburg, MD). Note that the concentration of the first
oligonucleotide is now slightly less than 200 nM. PCR™ fingerprinting was performed with one
cycle of 94°C for 2:00 min, 48°C for 5:00 min, then 72°C for 5:00 min. This was followed by 35
cycles of 94°C for 45 sec, 48°C for 1:30 min, and 72°C for 2:00 min. A final extension step of
72°C for 5:00 min was performed. Next, 4.0 µl of the final PCR™ products were mixed with 6.0 µl
of sequencing formamide dye solution and denatured by heating to 75°C for 5:00 min.
Approximately 2.5 µl of the denatured PCR™ products in formamide dye was electrophoresed
through a 6% polyacrylamide, 7M urea DNA sequencing gel. PCR™ products were visualized
by autoradiography.
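
The carry-over of the first oligonucleotide into the fingerprinting PCR™ can be checked with a short dilution calculation. The following is only an illustrative sketch using the volumes and concentration stated above; the function and variable names are hypothetical. It gives the 200 nM ceiling, and the actual value would be somewhat lower if, as assumed here, part of the primer was consumed during first strand synthesis.

```python
# Values taken from the text; names are hypothetical.
rt_primer_conc_um = 2.0   # first primer in the 40 µl reverse transcription (µM)
rt_aliquot_ul = 4.0       # one-tenth of the reverse transcription carried over
pcr_mix_ul = 36.0         # PCR mix added to the aliquot


def diluted_conc_um(stock_um, aliquot_ul, added_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    return stock_um * aliquot_ul / (aliquot_ul + added_ul)


# Upper bound only: primer consumed during cDNA synthesis is not subtracted.
upper_bound_nm = 1000.0 * diluted_conc_um(rt_primer_conc_um, rt_aliquot_ul, pcr_mix_ul)
print(f"First primer in the PCR: at most {upper_bound_nm:.0f} nM")
```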

The two differentially appearing PCR™ amplified cDNA fragments identified in these
studies that are the subjects of this report were termed UC331 and UC332. UC331 was
identified in a study in which the first promiscuous primer used in the reverse transcription
reaction had the sequence 5' ACGACTCACTATAAGCAGGA 3' (SEQ ID NO:24). The second
promiscuous primer that was used in the PCR™ fingerprinting reaction that identified UC331
was 5' AACAGCTATGACCATCGTGG 3' (SEQ ID NO:25). UC332 was identified in a study
in which the first promiscuous primer used in the reverse transcription reaction had the sequence
5' ACGACTCACTATGTGGAGAA 3' (SEQ ID NO:26). The second promiscuous primer that
was used in the PCR™ fingerprinting reaction that identified UC332 was 5'
AACAGCTATGACCCTGAGGA 3' (SEQ ID NO:27). After autoradiography, bands that
appeared differentially in fingerprinting reactions on the pooled total RNAs described above
were cut out of the gels and reamplified by PCR™. The reamplified PCR™ products were
directly sequenced using the Sequenase™ reagent system (Amersham Life Sciences, Inc.,
Arlington Heights, IL).
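
The two-domain layout of the four primers listed above can be inspected with a few lines of code. This is a minimal sketch, not part of the described method: it takes the last eight bases of each primer as the 3' domain (the text describes these as 8-mers) and treats whatever precedes them as the 5' domain. The helper name `split_domains` is hypothetical.

```python
# Primer sequences as listed in the text (SEQ ID NOs: 24-27).
primers = {
    "SEQ ID NO:24": "ACGACTCACTATAAGCAGGA",
    "SEQ ID NO:25": "AACAGCTATGACCATCGTGG",
    "SEQ ID NO:26": "ACGACTCACTATGTGGAGAA",
    "SEQ ID NO:27": "AACAGCTATGACCCTGAGGA",
}
THREE_PRIME_LEN = 8  # the 3' domains are described as 8-mers


def split_domains(seq, three_prime_len=THREE_PRIME_LEN):
    """Return (5' portion, 3' 8-mer) of a primer."""
    return seq[:-three_prime_len], seq[-three_prime_len:]


domains = {name: split_domains(seq) for name, seq in primers.items()}

# Primer pairs used together: NO:24 with NO:25 (UC331), NO:26 with NO:27 (UC332).
for first, second in (("SEQ ID NO:24", "SEQ ID NO:25"), ("SEQ ID NO:26", "SEQ ID NO:27")):
    five_a, three_a = domains[first]
    five_b, three_b = domains[second]
    # Exact-match check only; it does not measure partial similarity.
    print(first, second, "distinct 5':", five_a != five_b, "distinct 3':", three_a != three_b)
```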

The sequences of UC331 and UC332 were compared to those deposited in release 101 of
GenBank (July 1997) using the Lasergene™ software package (DNAstar, Inc., Madison, WI).
The DNA sequences of these cDNA fragments, when compared to the GenBank database,
revealed that the mRNAs from which these cDNA fragments were derived were previously
uncharacterized. Neither UC331 nor UC332 is a gene whose product has been previously
characterized as being significant in any physiological pathway. Both UC331 and UC332 match
sequences in the GenBank database.

In the case of UC331, these matches are confined to ESTs. UC331 was identical within
the limits of sequencing accuracy to several human EST sequences. The human EST sequences
with high similarity to UC331 could be assembled into a virtual contig that predicts the sequence
of a larger mRNA. The ends of the UC331 contig were then used to requery the EST database,
whereby more ESTs were identified that extended the contig. This process was continued until
the UC331 contig predicted an mRNA with an ORF and a poly-A tail. A description of the
human ESTs that were used to construct the UC331 contig is provided in Table 9. An ORF
was identified at the 5' end of the UC331 contig sequence. A significant feature of
this contig is that the ORF extends all the way to its 5' end. This indicates that the UC331
mRNA extends further 5' than is indicated by the contig constructed from the EST database.
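
The iterative extension described above (query with the contig ends, add overlapping ESTs, repeat) can be sketched as a greedy overlap extension. This is an illustrative toy under stated assumptions, not the procedure actually used: it requires exact suffix/prefix overlaps, ignores sequencing errors and reverse complements, and runs on a made-up sequence. All names are hypothetical.

```python
def overlap(left, right, min_len):
    """Length of the longest suffix of `left` equal to a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def extend_contig(seed, reads, min_len=6):
    """Greedily extend `seed` in both directions until no read adds sequence."""
    contig, unused = seed, list(reads)
    extended = True
    while extended:
        extended = False
        for read in list(unused):
            if read in contig:              # read adds nothing new
                unused.remove(read)
                continue
            n = overlap(contig, read, min_len)       # read extends the 3' end
            if n:
                contig += read[n:]
            else:
                n = overlap(read, contig, min_len)   # read extends the 5' end
                if n:
                    contig = read[:-n] + contig
            if n:
                unused.remove(read)
                extended = True
    return contig


# Made-up "mRNA" and fragments cut from it, standing in for ESTs.
target = "ATGGCGTCTTAGGCATCCGATTGACCTGAAGTCAAAAAA"
seed = target[12:24]
reads = [target[26:39], target[0:18], target[18:32]]
contig = extend_contig(seed, reads)
print(contig)
```

Real EST assembly would also need to tolerate mismatches and handle both strands, which this sketch deliberately leaves out.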

TABLE 9
UC331 EST Distribution
Human

GB Accession Number   Tissue                                              Library

AA403120              Total Fetus                                         Soares
[illegible]           Total Fetus                                         Soares
[illegible]           Pregnant Uterus                                     Soares
AA121262              Pregnant Uterus                                     Soares
R22146'               Placenta                                            Soares
R30954'               Placenta                                            Soares
R31006 j              Placenta                                            Soares
R32887 h              Placenta                                            Soares
R31390                Placenta                                            Soares
R67806 9              Placenta                                            Soares
R67807 9              Placenta                                            Soares
[illegible]           Placenta                                            Soares
AA385620              Thyroid                                             TIGR
W37985                Parathyroid Tumor                                   Soares
W37986                Parathyroid Tumor                                   Soares
AA380401              Cell line (Supt)                                    TIGR
AA182471              Cell line (HeLa)                                    Stratagene (IMAGE)
AA181530              Cell line (HeLa)                                    Stratagene (IMAGE)
W31231                Senescent Fibroblasts                               Soares
N22701                Normal Melanocyte                                   Soares
N31175                Normal Melanocyte                                   Soares
N34446                Normal Melanocyte                                   Soares
N34538                Normal Melanocyte                                   Soares
N36424                Normal Melanocyte                                   Soares
N36521                Normal Melanocyte                                   Soares
N42854                Normal Melanocyte                                   Soares
N44299                Normal Melanocyte                                   Soares
W56398                Normal Melanocyte                                   Soares
N66813                Normal Melanocyte                                   Soares
AA379996              Skin Tumor                                          TIGR
AA370040              Prostate Gland                                      TIGR
AA369851              Prostate Gland                                      TIGR
H08822 k              Brain (Whole infant)                                Soares
H08905"               Brain (Whole infant)                                Soares
H19533                Brain (Whole Adult)                                 Soares
H21379'               Brain (Whole Adult)                                 Soares
H21421'               Brain (Whole Adult)                                 Soares
H24360 6              Brain (Whole Adult)                                 Soares
H25176*               Brain (Whole Adult)                                 Soares
H38689                Brain (Whole Adult)                                 Soares
H38791                Brain (Whole Adult)                                 Soares
H39147"               Brain (Whole Adult)                                 Soares
H39148 d              Brain (Whole Adult)                                 Soares
H45092 c              Brain (Whole Adult)                                 Soares
H45054 c              Brain (Whole Adult)                                 Soares
H49928                Brain (Whole Adult)                                 Soares
H50463                Brain (Whole Adult)                                 Soares
H51403 a              Brain (Whole Adult)                                 Soares
H51444 3              Brain (Whole Adult)                                 Soares
H52811 b              Brain (Whole Adult)                                 Soares
H52774"               Brain (Whole Adult)                                 Soares
R85542                Brain (Whole Adult)                                 Soares
R84652                Brain (Whole Adult)                                 Soares
AA324855              Brain (Cerebellum)                                  TIGR
AA317211              Retina                                              TIGR
AA371911              Pituitary Gland                                     TIGR
AA302113              Endothelial Cells, Aorta                            TIGR
AA247643              Fetal Heart                                         U. Toronto
W60049                Fetal Heart                                         Soares
W61359                Fetal Heart                                         Soares
AA243511              B Cells                                             Soares
AA234769              Pooled: fetal heart, melanocytes, pregnant uterus   Soares
AA158239              Pancreas                                            Stratagene (IMAGE)
AA150565              Pancreas                                            Stratagene (IMAGE)
AA160836              Pancreas                                            Stratagene (IMAGE)
H73822                Fetal Liver Spleen                                  Soares
N58180                Fetal Liver Spleen                                  Soares
W04414                Fetal Liver Spleen                                  Soares
N94254                Fetal Liver Spleen                                  Soares
N75996                Fetal Liver Spleen                                  Soares
N69644                Fetal Liver Spleen                                  Soares
T83329                Fetal Liver Spleen                                  Soares
T72755                Fetal Liver Spleen                                  Soares
T53976                Pooled Fetal Spleens                                Soares
N76701                Multiple Sclerosis                                  Soares
N90814                Multiple Sclerosis                                  Soares
N63292                Multiple Sclerosis                                  Soares
N59233                Multiple Sclerosis                                  Soares
N53207                Multiple Sclerosis                                  Soares
N51545                Multiple Sclerosis                                  Soares
F22624                Skeletal Muscle                                     CRIBI (Italy)

Note: Paired superscripts indicate opposite ends of the same cDNA clone.

When the human UC331 contig was used to query the GenBank database, many mouse
EST sequences were identified with significant similarity. This was especially true in the region
spanning the putative ORF. The identified mouse ESTs were found to have areas of overlap and
similarity with each other that permitted them to be assembled into a mouse UC331 virtual
contig in a process that was identical to that used to create the human contig. The mouse UC331
virtual contig was also observed to have an ORF at its 5' end and a poly-A tail at its 3' end. A
description of the mouse ESTs that were used to construct this contig is provided in Table 10.

TABLE 10
Mouse

GB Accession Number   Tissue                          Library       Clone #

AA027487              Placenta                        Soares        459407 (5')
AA023708              Placenta                        Soares        456984 (5')
AA023154              Placenta                        Soares        456027 (5')
AA024303              Placenta                        Soares        458313 (5')
W35948                Total Fetus                     Soares        350258 (5')
W11581                Total Fetus                     Soares        318665 (5')
W36820                Total Fetus                     Soares        336707 (5')
AA002492              Mouse Embryo                    Soares        426498 (5')
AA097370              Mouse Embryo                    Soares        493073 (5')
AA014313              Mouse Embryo                    Soares        468491 (5')
AA450512              Beddington embryonic region     IMAGE         865186 (5')
AA408179 1            Embryo Ectoplacental Cone       Ko            C0025F09 (3')
AA408261 1            Embryo Ectoplacental Cone       Ko            C0025F09 (5')
AA117174              T-cells                         Stratagene    558134 (5')
AA119346              Thymus                          Soares        573567 (5')
AA183195              Lymph Node                      Soares        636222 (5')
AA122933              Kidney                          Barstead      579415 (5')
AA423613              Mammary Gland                   Soares        832219 (5')

Note: Paired superscripts indicate opposite ends of the same cDNA clone.

When the MegAlign™ program of the Lasergene™ DNA analysis software package
(DNAstar, Inc.) was used to compare the mouse and human UC331 contigs, the two contigs were
predicted to represent mRNA species that were highly similar and nearly collinear throughout
their lengths. This similarity was most striking in the region comprising the putative ORFs.
Within the ORFs of the mouse and human contigs, the DNA sequences are 89% identical. In the
predicted 3' untranslated regions of the two contigs, the DNA sequence similarity falls to 73%
with several small deletions and insertions. This higher degree of sequence similarity in the
putative ORFs as compared to the proposed 3' untranslated region is interpreted as evidence that
the ORFs encode proteins on which natural selection constrains amino acid sequence divergence.
Like the human UC331 contig, the mouse contig also encodes a putative ORF that extends all the
way to its 5' end. This provides additional support for the contention that the UC331 mRNA
contains more sequences at its 5' end than are represented by the EST based contigs presented
here.

The ORFs of the mouse and human UC331 contigs were conceptually translated and the
amino acid sequences were compared. The amino acid sequence of the human UC331 ORF was
used to query the Swiss, PIR and Translation release 101 database using the Lasergene™ software
package. For the 157 amino acids for which this comparison is possible, the mouse and human
sequences are collinear and identical at 151 positions (96%), with five of the six differences being
conservative substitutions. This putative protein domain is highly acidic with 26 acidic and 17
basic amino acids. There were also 48 hydrophobic and 41 polar amino acids predicted. When
either the predicted mouse or human UC331 amino acid sequence was compared to amino acid
sequences in the public protein sequence databases, no significant matches were found to any
previously characterized vertebrate proteins. However, a significant match was observed to a
putative protein, termed ZK353.1 (PIR Accession number S44654), encoded in the genome of
the nematode, Caenorhabditis elegans. The mammalian amino acid sequence is similar and
collinear with the C-terminal 157 amino acids of the putative C. elegans protein. Like the
mammalian UC331 amino acid sequences, the C-terminal 157 amino acid sequence of
ZK353.1 is also highly acidic with 31 acidic and only 20 basic amino acids. Over the 203 amino
acids for which a comparison can be made, the ZK353.1 amino acid sequence is identical to the
human or mouse sequence at 84 (41%) positions with many of the differences representing
conservative substitutions.
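
The identity percentages quoted above follow directly from the stated counts. The short sketch below simply reproduces that arithmetic; the function name is hypothetical, and the helper for aligned strings is shown on made-up sequences because the alignments themselves are not reproduced here.

```python
def percent_identity(identical, compared):
    """Identical positions as a percentage of compared positions."""
    return 100.0 * identical / compared


def count_identities(seq_a, seq_b):
    """Count matching positions in two collinear, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Counts taken from the text.
mouse_vs_human = percent_identity(151, 157)
zk353_vs_mammal = percent_identity(84, 203)
print(f"mouse vs human: {mouse_vs_human:.1f}%")
print(f"ZK353.1 vs mammalian: {zk353_vs_mammal:.1f}%")

# Made-up peptides, only to show the counting step.
toy_matches = count_identities("MKDEEL", "MKDDEL")
```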

The putative C. elegans protein, ZK353.1, has no currently known function. Its existence
is predicted from the C. elegans genome sequencing effort (Sulston et al., 1992). The
polypeptide sequence for ZK353.1 is a conceptual translation of an area on C. elegans
chromosome III (GB accession number CELZK353). The predicted sequence for ZK353.1 is



WO 98/24935 PCT/US97/22105

548 amino acids long and includes an additional 371 amino acids that are N-terminal of the
domain with similarity to the predicted amino acid sequence of UC331. If UC331 is the
mammalian homolog of ZK353.1 and if UC331 is collinear with the C. elegans protein over its
entire length, it could be expected that the ORF of UC331 would extend roughly an additional
1100 nucleotides 5' of the sequence in SEQ ID NO:29. While it is likely that the UC331 ORF
extends further 5' than is accounted for in the virtual mouse and human UC331 contigs, Northern
blot data from human poly-A plus RNA discussed below indicate that the human UC331
mRNA extends only about 350 nucleotides further 5'. This may indicate an error in interpreting
the possible pattern of mRNA processing from the C. elegans sequence or indicate simply that
the mammalian and nematode mRNAs and encoded proteins are significantly different from each
other at their 5' and N-terminal ends respectively.
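
The estimate of roughly 1100 additional nucleotides follows from the 371 additional N-terminal amino acids at three nucleotides per codon. The sketch below reproduces that arithmetic and, as an added illustration only, converts the approximately 350 nucleotides suggested by the Northern blot into a maximum number of extra codons. Variable names are hypothetical.

```python
NT_PER_CODON = 3

n_terminal_extra_aa = 371      # ZK353.1 residues N-terminal of the similar domain
northern_extra_nt = 350        # approximate 5' extension indicated by the Northern blot

# Extension predicted if UC331 were collinear with ZK353.1 over its whole length.
predicted_extra_nt = n_terminal_extra_aa * NT_PER_CODON

# Upper limit on extra codons if the whole 350 nt were coding (an assumption;
# part of it may be 5' untranslated sequence).
max_extra_codons = northern_extra_nt // NT_PER_CODON

print(predicted_extra_nt, max_extra_codons)
```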

To confirm that the human UC331 virtual contig accurately represented the sequence of 
an authentic mRNA, oligonucleotides were designed to direct the PCR™ amplification of large 
cDNA fragments predicted to be continuous from the virtual contig but which contain 
significantly more sequence than can be found in any single EST. 

UC332 did not match any EST sequences but was identical to a portion of a previously 
sequenced full length cDNA with a GenBank accession number of D87451. 

RELATIVE QUANTITATIVE RT-PCR™

Frequently, mRNAs identified by RNA fingerprinting or differential display as being
differentially regulated turn out not to be so when examined by independent means. It is,
therefore, critical that the differential expression of all mRNAs identified by RNA fingerprinting
be confirmed as such by an independent methodology. To independently confirm the differential
expression of UC331 in the peripheral blood of patients with recurrent metastatic cancer
compared to the peripheral blood of healthy volunteers, two different formats for a relative
quantitative RT-PCR™ were performed. The first format of this assay examined normalized
pools of cDNA constructed by combining equal amounts of cDNA from various individuals
representing similar physiologic states. In this study, a cDNA pool representing 8 healthy
volunteers was compared to a pool representing 10 individuals with recurrent metastatic prostate
cancer. A third pool representing 10 individuals with recurrent metastatic breast cancer was also
examined. The breast cancer patient samples were included in this study to
determine whether the mRNAs examined were being differentially regulated in the immune system in a
response that was specific for prostate cancer or whether the response was common to metastatic
cancer in general. Using these pools of cDNA as templates, triplicate PCR™ was performed.
Each of the three replicates was terminated at a different cycle number of PCR™. This format
of relative quantitative RT-PCR™ ensures that the results taken for relative quantitation represent
the PCRs™ when they are in the log linear portions of their amplification curves where such
quantitation is most accurate.

Approximately 1.5-5.0 µg of DNA-free total RNA from the peripheral blood of healthy
volunteers or patients with either metastatic prostate or breast cancer was converted into first
strand cDNA using the Superscript™ Preamplification System for First Strand cDNA Synthesis
(GIBCO-BRL, Cat# 18089-011) following the directions provided by the supplier. These
cDNAs were then normalized to contain equal concentrations of amplifiable cDNA by PCR™
amplification of β-actin cDNA using the primers 5' GGAGCTGCCTGACGGCCAGGTCATC 3'
(SEQ ID NO:28) and 5' GAAGCATTTGCGGTGGACGATGGAG 3' (SEQ ID NO:9). A
typical PCR™ program would be 94°C for 1:15 min, followed by 22 cycles of 94°C for 45 sec,
55°C for 45 sec and 72°C for 1:15 min. This was followed by a final extension of 72°C for 5:00
min. PCR™ products were visualized by gel electrophoresis through 1.5% agarose TAE gels
stained with ethidium bromide. Images of the gels were captured, digitized and analyzed using
the IS-1000 Digital Imaging System (Alpha Innotech Corp.). The concentrations of the cDNAs
were adjusted by adding various amounts of water to create cDNA stocks that contained equal
concentrations of amplifiable β-actin cDNA. Typically, the cDNA derived from the reverse
transcription of 5.0 µg of RNA resulted in enough normalized cDNA to perform 50-200
RT-PCR™ reactions.
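
The dilution step used for normalization can be illustrated with a short calculation. This is a hedged sketch, not the procedure as performed: it assumes that β-actin band intensity is proportional to the concentration of amplifiable cDNA (reasonable only within the log-linear range), and the sample names, intensities and volumes are placeholders rather than data from this study.

```python
def water_to_add_ul(intensity, target_intensity, volume_ul):
    """Water needed to dilute a stock so its signal matches the target."""
    if intensity < target_intensity:
        raise ValueError("a stock cannot be concentrated by adding water")
    return volume_ul * (intensity / target_intensity - 1.0)


# Placeholder values: name -> (β-actin band intensity, stock volume in µl).
stocks = {"N1": (1200.0, 50.0), "P1": (1800.0, 50.0), "B1": (1500.0, 50.0)}

# Dilute every stock down to the weakest one.
target = min(intensity for intensity, _ in stocks.values())
additions = {
    name: water_to_add_ul(intensity, target, volume)
    for name, (intensity, volume) in stocks.items()
}
for name, water in additions.items():
    print(f"{name}: add {water:.1f} µl water")
```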

Equal amounts of the normalized cDNA stock from individuals having the same disease
state were pooled. Pools of cDNAs from healthy volunteers, patients with metastatic prostate
cancer and metastatic breast cancer were produced. These pools were then examined by PCR™
for β-actin to determine that they contained equal amounts of amplifiable cDNA.

To demonstrate that all observations were made in the log-linear phase of the PCR™
amplification curve, a series of PCR™ reactions using different cycle numbers was performed on
each cDNA pool for each gene (primer pair) examined. Display of the PCR™ products on
electrophoretic gels and analysis with the IS-1000 Digital Imaging System illustrates that the
mass of the PCR™ products increases exponentially with increasing cycle number, confirming
that the observed results are in the log-linear portion of the PCR™ amplification curve.
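
One way to express the log-linear check numerically is to compare the per-cycle fold increase between successive sampling points: in the exponential phase it stays roughly constant, and it drops as the reaction approaches plateau. The sketch below is illustrative only; the 15% tolerance and the band intensities are assumptions, not values from this study.

```python
def per_cycle_fold(cycles, intensities):
    """Per-cycle fold increase between each pair of successive time points."""
    folds = []
    for i in range(len(cycles) - 1):
        span = cycles[i + 1] - cycles[i]
        folds.append((intensities[i + 1] / intensities[i]) ** (1.0 / span))
    return folds


def is_log_linear(cycles, intensities, tolerance=0.15):
    """True if product keeps growing at a near-constant per-cycle rate."""
    folds = per_cycle_fold(cycles, intensities)
    return min(folds) > 1.0 and (max(folds) - min(folds)) / max(folds) <= tolerance


# Placeholder band intensities in arbitrary units.
print(is_log_linear([22, 24, 26], [100.0, 380.0, 1450.0]))   # steady growth
print(is_log_linear([25, 28, 31], [100.0, 800.0, 1600.0]))   # growth slows down
```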

Relative quantitative RT-PCR™ showed nearly equal amounts of amplifiable β-actin
cDNA in the three pools of cDNA. Pools of normalized cDNAs were constructed from peripheral
blood RNAs from eight healthy volunteers, ten individuals with recurrent metastatic prostate
cancer, or ten individuals with recurrent metastatic breast cancer. Three separate PCR™
reactions were performed on each pool of cDNA. PCR™ was terminated at differing cycle
numbers (cycle 22, cycle 24, and cycle 26), and the products were visualized by electrophoresis
and ethidium bromide staining. Images were captured and quantitated using a digital image
analysis system. At all three cycle numbers examined, there are relatively similar band
intensities representing the three cDNA pools and increasing band intensity with increasing cycle
number, verifying that the observations are being made in the log linear range of the
amplification curves. Similar band intensities indicate similar relative concentrations of β-actin
mRNA in the RNAs from the individuals from which these cDNA pools were constructed.

The oligonucleotides used in the relative quantitative RT-PCR™ studies that
independently confirmed the differential expression of UC331 were designed from the sequence
in the human UC331 virtual contig. These UC331 specific oligonucleotides had the sequences
5' CTGGCCTACGGAAGATACGACAC 3' (SEQ ID NO:31) and 5'
ACAATCCGGAGGCATCAGAAACT 3' (SEQ ID NO:32). These oligonucleotides direct the
amplification of a 277 nucleotide long PCR™ product that is specific for UC331. The
oligonucleotides used in the relative quantitative RT-PCR™ studies that independently
confirmed the differential expression of UC332 were designed using the sequence of the cDNA
with the GenBank accession number D87451. These UC332 specific oligonucleotides had the
sequences 5' AGCCCCGGCCTCCTCGTCCTC 3' (SEQ ID NO:33) and 5'
GGCGGCGGCAGCGGTTCTC 3' (SEQ ID NO:34). These oligonucleotides direct the
amplification of a 140 nucleotide long PCR™ product that is specific for UC332.

The results for relative levels of β-actin expression contrast sharply with those observed
when oligonucleotide primers specific for UC331 were used to direct PCR™ amplification (FIG.
7). At 25 cycles of PCR™, clear bands are visible in the lanes representing the pools of cDNA
from peripheral blood of patients with either metastatic breast or prostate cancer. In the lane
representing the peripheral blood of healthy volunteers, only a very faint band is present. At 28
cycles of PCR™, the band intensities representing all three pools are brighter than they were at
25 cycles, but the relative increase in intensity of the bands representing the metastatic cancer
patient pools compared to the healthy volunteers remains the same as was observed at 25 cycles
of PCR™. This indicates that these observations are being made in the log linear range of the
PCR™ amplification curves. At 31 cycles of PCR™, there is still an increase in the intensity of
the bands representing the pools of metastatic cancer patients compared to the pool representing
the healthy volunteers, but a quantitative analysis of these bands indicates that the PCRs™ have
left the log linear range of their amplification curves. Quantitation of the data for 25 and 28
cycles of PCR™ independently confirms that UC331 mRNA is differentially regulated and is
roughly seven fold more abundant in the peripheral blood leukocytes of the average patient with
either recurrent metastatic prostate cancer or breast cancer than in the peripheral blood
leukocytes of healthy volunteers.
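
A fold difference of this kind can be computed from digitized band intensities as a ratio, optionally corrected by the β-actin ratio of the same two pools, and averaged over the cycle numbers that lie in the log-linear range. The sketch below is an assumed formulation for illustration; the intensities are arbitrary placeholders and are not the measurements behind the seven-fold figure.

```python
def relative_fold(target_sample, target_reference, actin_sample=1.0, actin_reference=1.0):
    """Target band ratio (sample/reference) corrected by the β-actin band ratio."""
    return (target_sample / target_reference) / (actin_sample / actin_reference)


# Placeholder band intensities (arbitrary units), NOT data from this study.
bands = {
    25: {"healthy": 100.0, "patient": 400.0},
    28: {"healthy": 800.0, "patient": 3200.0},
}

# Only cycle numbers within the log-linear range are included.
folds = {cycle: relative_fold(b["patient"], b["healthy"]) for cycle, b in bands.items()}
mean_fold = sum(folds.values()) / len(folds)
print(folds, mean_fold)
```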

The second format of relative quantitative RT-PCR™ used to examine the differential
expression of UC331 examined the relative abundance of UC331 mRNA in the peripheral blood
of healthy individuals or individuals with recurrent metastatic cancer. The individuals examined
in this study were the same as those whose cDNAs were combined to construct the pools
examined as described above. Using the information obtained from the pooled cDNA study to
predict at what PCR™ cycle numbers relative quantitative RT-PCR™ would be most
informative, these individuals were examined for the relative abundance of β-actin and UC331
mRNAs present in their peripheral blood leukocytes. PCR™ was performed for 22 cycles. All
individuals examined contain roughly equal amounts of amplifiable β-actin cDNA. Some of the
differences in β-actin band intensity observed in this study are probably due to the internal
variation inherent in this study. Results from studies designed to quantitate this internal variation
indicate that identical replicates of a β-actin PCR™ can be expected to vary in the intensity of
product bands with a standard deviation of ±15%.
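
Replicate variation of this kind is commonly expressed as a standard deviation relative to the mean band intensity. The sketch below shows that calculation on placeholder replicate values and adds a simple screen for whether two bands differ by more than replicate noise; the two-standard-deviation threshold is an assumption made for illustration and is not stated in the text.

```python
import statistics


def relative_sd_percent(intensities):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(intensities) / statistics.mean(intensities)


def exceeds_replicate_noise(a, b, rel_sd=0.15, n_sd=2.0):
    """True if two band intensities differ by more than n_sd relative SDs."""
    midpoint = (a + b) / 2.0
    return abs(a - b) / midpoint > n_sd * rel_sd


# Placeholder replicate intensities (arbitrary units), NOT data from this study.
replicates = [100.0, 115.0, 85.0, 100.0]
rsd = relative_sd_percent(replicates)
print(f"relative SD: {rsd:.1f}%")
print(exceeds_replicate_noise(100.0, 110.0), exceeds_replicate_noise(100.0, 700.0))
```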

Relative quantitative RT-PCR™ of UC331 was conducted using cDNA reverse
transcribed from RNA isolated from the peripheral blood of eight healthy volunteers (group N),
ten individuals with recurrent metastatic prostate cancer (group P), or ten individuals with
recurrent metastatic breast cancer (group B). PCR™ was performed for 30 cycles. As was seen in
the study using the pooled cDNAs, the results of the relative quantitative RT-PCR™ for UC331
using cDNA from individuals contrast sharply with those observed for β-actin. The intensity of the
band representing the abundance of the UC331 mRNA in peripheral blood leukocytes was
greater for all of the patients with either metastatic prostate or breast cancer as compared to the
intensity of the UC331 band representing the mRNA level in the peripheral blood leukocytes of
healthy volunteers. Therefore, the elevated UC331 mRNA levels indicated by the relative
quantitative RT-PCR™ results using the pooled cDNA templates were caused by an elevated
mRNA level in all individuals comprising the pools and not by a subset of individuals with
very high elevations in UC331 mRNA levels. This study is a second independent confirmation
of the differential expression of the UC331 mRNA.

As is indicated by the wide distribution of tissues from which the ESTs used to assemble
the UC331 contigs were derived (Table 9), UC331 is widely expressed in many tissue and cell types.
However, because most of the ESTs comprising UC331 are from normalized libraries, little
information can be gained from these data on the relative abundance of the UC331 mRNA in
different tissues. Also, while the extension of the ORFs of the mouse and human UC331 contigs
all the way to their 5' ends and the similarity of mammalian UC331 mRNAs to a much larger
putative C. elegans mRNA both predict that the mammalian UC331 mRNA extends even further
5', the exact size of the UC331 mRNA was unknown. To address all of these issues, a Northern
blot of poly-A plus RNA from eight different human tissues was probed with the 850 nucleotide
long RT-PCR™ product described above labeled with 32P. Approximately 2.0 µg of poly-A plus
RNA from each of spleen, thymus, prostate, testis, ovary, small intestine, colon, and peripheral
blood leukocytes was loaded per lane. UC331 mRNA is expressed in all eight human tissue and
cell types. Size standards indicate a message size of approximately 1.75 kb. Interestingly,
UC331 is least abundant in peripheral blood leukocytes but is highly expressed in the thymus,
demonstrating a difference in expression between cells of different developmental stages in the
immune system. UC331 is most abundantly expressed in the testes. The UC331 mRNA is about
1.75 kb, which indicates that the mRNA extends only about 350 nucleotides further 5' than is
accounted for by the virtual contig shown in SEQ ID NO:29. The translation product of the
virtual contig is shown in SEQ ID NO:30. Clearly, the putative C. elegans mRNA extends much
further 5' than do the mammalian mRNA species.

The other gene identified as being differentially regulated in this RNA fingerprinting 
study was UC332. UC332 was analyzed in much the same way as UC331 was. When the 
sequence of the cDNA fragment from the RNA fingerprinting gel representing UC332 was used 
to query GenBank, no ESTs were identified. The sequence of the UC332 cDNA fragment did, 
however, identify a sequence of a full length cDNA, KA000262 (GBraccession number 
D87451). The sequence of KA000262, (hereafter referred to interchangeably with the name, 
UC332) was determined as part of a project to examine previously unidentified mRNAs 
expressed in the bone marrow myeloblast cell line, KG-1 (Nagase et al, 1996). This mRNA 
contains an ORF encoding a putative protein with 761 amino acid sequence. Perhaps the most 
striking feature of this polypeptide sequence is the appearance of a C3HC4 RING zinc finger or 
RING finger motif (Freemont, 1993) located between amino acids 175 and 216. The RING 
finger domain binds two zinc ions in a conserved structure that has been resolved (Barlow et al, 
1994). RING finger domains have been identified in dozens of proteins derived from eukaryotes 
as diverse as yeasts, flies, birds, nematodes and humans. In most of these cases, the RING
finger-containing proteins have been shown to be essential for some important biological process,
although these processes vary considerably from one another. Among the mammalian
RING finger proteins are several implicated in the ontogeny of cancer, including
the ret viral oncogene (Takahashi et al., 1988) and bmi-1, a gene whose product collaborates
with myc induced transformation (Haupt et al., 1991). The BRCA-1 tumor suppressor gene
involved in hereditary breast and ovarian cancer susceptibility contains a RING finger domain
(Miki et al., 1994), and MAT-1, a novel 36 kDa RING finger protein, is required for the
assembly of enzymatically active CDK7-cyclin H complexes (Tassan et al., 1995). A
comparison of the RING finger domains of UC332 and various representative members of this
group, including BRCA1, rpt-1, Traf5, HT2A, MAT1, rfp, bmi-1, CRZF, and neu, indicates that the
RING finger domain of UC332 is slightly more similar to those found in the tumor suppressor
gene, BRCA1, and the T cell repressor of transcription protein, rpt-1. However, BRCA1 and
rpt-1 are more similar to each other than they are to UC332.
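
A C3HC4 RING finger of the kind described above can be located in a polypeptide sequence with a simple pattern search. The following is a minimal sketch only. The spacing ranges in the pattern follow the commonly cited consensus C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C, which is an assumption and is not stated in this disclosure; the function name and the synthetic test sequence are hypothetical.

```python
import re

# Commonly cited C3HC4 consensus spacing (an assumption, not taken from this disclosure).
RING_C3HC4 = re.compile(r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C")


def find_ring_fingers(protein):
    """Return 1-based inclusive (start, end) positions of candidate C3HC4 motifs."""
    seq = protein.upper()
    return [(m.start() + 1, m.end()) for m in RING_C3HC4.finditer(seq)]


# Synthetic sequence built only to exercise the pattern; it is not a real protein.
synthetic = (
    "MK" + "C" + "AA" + "C" + "A" * 10 + "C" + "A" + "H"
    + "AA" + "C" + "AA" + "C" + "A" * 6 + "C" + "AA" + "C" + "GG"
)

if __name__ == "__main__":
    print(find_ring_fingers(synthetic))
```

A hit from such a scan is only a candidate; the comparison with known RING finger proteins described above would still be needed to support the assignment.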
Proteins with RING finger motifs exhibit heterogeneity in their subcellular localizations.
Some that are important regulators of differential gene expression localize to the cell nucleus.
When the amino acid sequence of UC332 was scanned for evidence of subcellular localization,
two domains were identified that contained sequences for putative nuclear localization signals
(NLS). NLS are highly basic stretches of six or more amino acids, of which at least four are
basic, that tend to be flanked by acidic amino acids and/or prolines (Boulikas, 1994). Both of the
putative NLS in UC332 are longer and more basic than the minimum requirements for the consensus
NLS motif. The first of these putative NLS motifs occurs between amino acids 548 and 567.
Within this domain, 13 of 19 amino acids are basic. In fact, this domain could be viewed as two
NLS in tandem separated by two glutamic acid residues. If divided this way, the first NLS
domain would have 8 of 11 positions as basic amino acids while the second motif would
have 5 of 6 amino acids being basic. The second NLS motif in UC332 is located near the
C-terminal end between positions 739 and 750 in the amino acid sequence. This domain has 8 of
12 amino acids as basic residues with a core of 5 consecutive lysines and arginines. The
presence of these putative NLS in the amino acid sequence of UC332 suggests the possibility that
UC332 plays an important role in regulating the expression of other genes. Finally, the amino
acid sequence of UC332 lacks a signal sequence for cellular export and any obvious hydrophobic
transmembrane domains.
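
The minimum NLS criterion quoted above (a stretch of six or more residues of which at least four are basic) lends itself to a sliding-window scan. The sketch below is illustrative and makes several assumptions: only lysine (K) and arginine (R) are counted as basic, flanking acidic residues and prolines are ignored, and the function names and test peptide are hypothetical rather than taken from UC332.

```python
BASIC = set("KR")  # assumption: count lysine and arginine only


def basic_windows(protein, window=6, min_basic=4):
    """Return 1-based start positions of windows meeting the basic-residue threshold."""
    seq = protein.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        if sum(aa in BASIC for aa in seq[i:i + window]) >= min_basic:
            hits.append(i + 1)
    return hits


def merge_windows(starts, window=6):
    """Merge overlapping or adjacent windows into 1-based inclusive (start, end) regions."""
    regions = []
    for s in starts:
        e = s + window - 1
        if regions and s <= regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], max(regions[-1][1], e))
        else:
            regions.append((s, e))
    return regions


if __name__ == "__main__":
    peptide = "AAAAKKRKRAAAAAA"  # hypothetical peptide, not from UC332
    print(merge_windows(basic_windows(peptide)))
```

Merged regions are deliberately generous; a region reported this way would still need to be inspected by eye, as was done for the two UC332 domains.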

To independently verify that UC332 mRNA is more abundant in the peripheral blood 
leukocytes of patients with recurrent metastatic cancer as compared to the peripheral blood 
leukocytes of healthy volunteers, relative quantitative RT-PCR™ was performed using the same 
cDNAs and formats as were used to investigate the differential regulation of UC331. A relative 
quantitative RT-PCR™ study using UC332 specific oligonucleotide primers and cDNA pools as 
templates was conducted. At 25 and 28 cycles of PCR™, the amplified DNA band representing 
the relative abundance of the UC332 mRNA is stained more intensely for those reactions that 
used cDNA template pools constructed from the peripheral blood leukocyte RNA isolated from 
metastatic prostate and breast cancer patients as compared to a similar pool constructed from 
RNA from healthy volunteers. Quantitation of this image using the IS-1000 Digital Imaging
System (Alpha Innotech, Inc.) indicates that UC332 mRNA is roughly 5 times more abundant in
the peripheral blood leukocytes of metastatic cancer patients compared to healthy volunteers. At
31 cycles of PCR™, the reactions have left the log linear range of their amplification curves.
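
Why the comparison is made at 25 and 28 cycles rather than at 31 can be illustrated with an idealized amplification model. The sketch below assumes perfect doubling per cycle and a hard plateau; the starting template amounts, the plateau value and the function names are all hypothetical and are chosen only to show that a 5-fold difference in template is preserved while both reactions are in the log linear range and is lost once they saturate.

```python
def pcr_yield(n0, cycles, efficiency=1.0, plateau=2 ** 31):
    """Idealized PCR yield: exponential growth capped at a plateau (illustrative model)."""
    return min(n0 * (1.0 + efficiency) ** cycles, plateau)


def fold_difference(n0_a, n0_b, cycles, **kwargs):
    """Ratio of the yields of two reactions after the same number of cycles."""
    return pcr_yield(n0_a, cycles, **kwargs) / pcr_yield(n0_b, cycles, **kwargs)


if __name__ == "__main__":
    # Hypothetical template amounts differing 5-fold.
    for cycles in (25, 28, 31):
        print(cycles, fold_difference(5, 1, cycles))
```

In this toy model the measured ratio equals the template ratio only while neither reaction has reached the plateau, which is the reason for restricting quantitation to the log linear range.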

In a second relative quantitative RT-PCR™ study using UC332 specific oligonucleotide
primers, the peripheral blood leukocyte cDNAs from the individuals that comprised the pools
(eight healthy volunteers, ten individuals with recurrent metastatic prostate
cancer, and ten individuals with recurrent metastatic breast cancer) were examined separately.
PCR™ was performed for 26 cycles. The results of this study are similar to those obtained when the pooled
cDNAs were used as PCR™ templates. All of the cancer patients had higher levels of UC332 
mRNA in their peripheral blood leukocytes than did any of the healthy volunteers. 
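
The statement that every cancer patient had a higher UC332 mRNA level than any healthy volunteer corresponds to complete separation of the two groups. A minimal sketch of that check follows; the intensity values are invented for illustration (they are not data from this study) and the function name is hypothetical.

```python
def completely_separated(cases, controls):
    """True if every case value exceeds every control value."""
    return min(cases) > max(controls)


# Hypothetical normalized band intensities, for illustration only.
healthy = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 0.7, 1.1]
prostate = [3.9, 5.2, 4.4, 6.1, 4.8, 5.5, 3.6, 4.9, 5.0, 4.2]
breast = [4.1, 5.8, 4.6, 3.8, 5.1, 6.0, 4.7, 5.3, 4.0, 4.5]

if __name__ == "__main__":
    print(completely_separated(prostate + breast, healthy))
```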

In this study, the inventors showed that UC332, encoding a RING finger protein, is
up-regulated in the peripheral blood leukocytes of patients with either recurrent metastatic breast or
prostate cancer. From the literature, RING finger proteins have been shown to participate in the
regulation of several important lymphocytic processes (Patarca et al., 1988; Fridell et al., 1995;
Takeuchi et al., 1996; van Arsdale et al., 1997; Nakano et al., 1996). The observed differential
regulation of the RING protein encoding mRNA, UC332, in the immune response of patients
with metastatic breast or prostate cancer strongly suggests that UC332 participates in regulating
this immune response.

All of the compositions and methods disclosed and claimed herein may be made and 
executed without undue experimentation in light of the present disclosure. While the compositions 
and methods of this disclosure have been described in terms of preferred embodiments, it is 
apparent that variations may be applied to the compositions, to the methods, and in the steps or in the
sequence of steps of the method described herein without departing from the concept, spirit and 
scope of the invention. 

More specifically, it is apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents described herein while the same or similar 
results would be achieved. All such similar substitutes and modifications apparent to those skilled 
in the art are deemed to be within the spirit, scope and concept of the disclosure as defined by the 
appended claims. 

UC325-1 is derived from the IL-8 gene (Genebank Accession #M28130). UC325-1 and
UC325-2, an alternatively spliced form that includes the third intron of the IL-8 primary transcript,
are transcribed from the IL-8 gene. Our definition of IL-8 gene products means all mRNAs
transcribed from the IL-8 gene, the polypeptides encoded by those mRNAs and their
post-translationally processed protein products.

Those practiced in the art will realize that there exists naturally occurring genetic
variation between individuals. As a result, some individuals may synthesize IL-8 gene products
that differ from those described by the sequences entailed in the Genebank number listed above.
We include in our definition of IL-8 those products encoded by IL-8 genes that vary in sequence
from those described above. Those practiced in the art will realize that modest variations in DNA
sequence will not significantly obscure the identity of a gene product as being derived from the
IL-8 gene.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: UROCOR, Inc. 

(B) STREET: 800 Research Parkway 

(C) CITY: Oklahoma City 

(D) STATE: Oklahoma 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 73104 

(ii) TITLE OF INVENTION: DIAGNOSIS OF DISEASE STATE USING mRNA 
PROFILES 

(iii) NUMBER OF SEQUENCES: 34 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 253 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGCAGGGGCT TGTGACTCTA AGATGGCTTC ATTCACATGC CTAGGGCCTC AGTAGGATGA 60 

CTGGCATGGC CCTGGAAAAC TGCGAAGTCT TCTCTCTGTG CAAACTTTCA CCTGGACTTT 120 

TTATATGATT CTGGAAGTAT TCCAAGAAGG CAAAAGTAAA AACTGCAAAG CGTCTTAAAA 180 

TAGAAGTTCA GAAGCCACAT TATATCACTT CTGTTGCATT CTATCAAAGC AAGTCACAAG 240

CCCCTGCCAA TCA 253 
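
Each entry in this listing declares a LENGTH and then gives the residues in blocks of ten with a running position count at the end of each line, so the two can be cross-checked mechanically. The sketch below is one way to do that under stated assumptions: the parser keeps only nucleotide letters and discards digits and spaces, and the function names and the short example fragment are hypothetical.

```python
import re


def parse_listing_sequence(lines):
    """Concatenate residues from listing lines, dropping spaces and position counts."""
    parts = []
    for line in lines:
        parts.append("".join(re.findall(r"[ACGTUNacgtun]+", line)))
    return "".join(parts).upper()


def check_declared_length(lines, declared):
    """Return (matches, actual_length) for a declared LENGTH value."""
    seq = parse_listing_sequence(lines)
    return len(seq) == declared, len(seq)


if __name__ == "__main__":
    # Hypothetical two-line fragment in the listing's layout.
    fragment = ["ACGTACGTAC GTACGTACGT 20", "ACGTA 25"]
    print(check_declared_length(fragment, 25))
```

Because digits and whitespace are ignored, a position count split by a stray space does not affect the residue count.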



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 183 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CACACACTCC CCCATTCTGA GCCCCAAGAG GCTCATCCCT AAGGATGTCC AGAGATCCAA 60

GTGCAGAAGG AGAATGTGGT GAGGCTATTT ATTCCCCCAG TGCCTTCCCT GCTGGGCTAT 120

GGATGAACAG TGGCTGACTT CATCTAGGAA AGAGCTATGG CTTCTGTCTC CTGGAGCTCA 180

CCA 183



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGTGAGCCCC AGGAGACAGA AGAGATATGA GGAAATTGTT AAGGAAGTCA GCACTTACAT 60 

TAAGAAAATT GGCTACAACC CCGACACAGT AGCATTTGTG CCAATTTCTG GTTGGAATGG 120

TGACAACATG CTGGAGCCAA GTGCTAACAT GCCTTGGTTC AAGGGATGGA AAGTCACCCG 180 

TAAGGATGGC AATGCCAGTG GAACCACGCT GCTTGAGGCT CTGGACTGCA TCCTACCACC 240 

AACTCGTCCA ACTGACAAGC CCTTGCGCCT GCCTCTCCAA GGATGTTCTT ACAAAATTGG 300 

TGGTATTGGT ACTGTTCCCT GTTTGGCCGA ATTGGAAAAC TGGTGTTCCT CCAAACCCCG 360

GTTATGGTGG GTTTCCTCCT CCTTGGA 387 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GGGCGGAACA AGGGAGCGCT AAAAGGAAAT TAGGATGTCA GGTGCATAAA GGAACATAAT 60

TCCAAAACCT TTCCAAACCC CAAATTTATT CAAAGGAACT GAGGAGTGGA TTGAGGAGTG 120 

GACCAACACT GGCGCCAAAC ACAGAAATTA TTGTAAAGCT TTCTGATGGA AGAGAGCTCT 180 

GTCTGGGCCC CAAGGAAAAC TGGGTGCAGA GGGTTGTGGA GAAGTTTTTG AAGAGGGCTG 240

AGAATTCATA AAAAAATTCA TTCTCTGTGG TATCCAAGAA TCAGTGAAGA TGCCAGTGAA 300

ACTTCAAGCA AATCTACTTC AACACTTCAT GTATTGTGTG GGTCTGTTGT AGGGTTGCCA 360

GTTGTT 366



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 598 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GCTTGGGCCC CAAGGAAAAC TGGGTGCAGA GGGTTGTGGA GAAGTTTTTG AAGAGGTAAG 60 

TTATATATTT TTGAATTTAA AATTTGTCAT TTATCCGTGA GACATATAAT CCAAAGTCAG 120 

CCTATAAATT TCTTTCTGTT GCTAAAAATC GTCATTAGGT ATCTGCCTTT TTGGTTAAAA 180 

AAAAAAGGAA TAGCATCAAT AGTGAGTGTG TTGTACTCAT GACCAGAAAG ACCATACATA 240

GTTTGCCCAG GAAATTCTGG GTTTAAGCTT GTGTCCTATA CTCTTAGTAA AGTTCTTTGT 300

CACTCCCAGT AGTGTCCTAT GTTAGATGAT AATGTCTTTG ATCTCCCTAT TTATAGTTGA 360

GAATATAGAG CATGTCTAAC ACATGAATGT CAAAGACTAT ATTGACTTTT CAAGAACCCT 420

ACTTTCCTTC TTATTAAACA TAGCTCATCT TTATATTGTG AATTTTATTT TAGGGCTGAG 480

AATTCATAAA AAAATTCATT CTCTGTGGTA TCCAAGAATC AGTGAAGATG CCAGTGAAAC 540

TTCAAGCAAA TCTACTTCAA CACTTCATGT ATTGTGTGGG TCTGTTGTAG GGTTGCCA 598



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 6: 
CGCCTCAGGC TGGGGCAGCA TT 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ACAGTGGAAG AGTCTCATTC GAGAT 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CGACCTGCCT GACGGCCAGG TCATC 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAAGCATTTG CGGTGGACGA TGGAG 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TGCAAACTTT CACCTGGACT T 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTTGTGACTT GCTTTGATAG AATG 24
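
Reading the listing, SEQ ID NO:10 occurs verbatim within SEQ ID NO:1, and SEQ ID NO:11 is the reverse complement of a stretch near the 3' end of SEQ ID NO:1. That is consistent with the two oligonucleotides forming a primer pair for that fragment, although this is an inference from the sequences and is not stated in the listing itself. The sketch below shows how such a relationship can be checked; the function names and the short example template are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_bounds(template, forward, reverse):
    """Return 1-based inclusive (start, end) of the product, or None if a site is missing."""
    template = template.upper()
    start = template.find(forward.upper())
    site = template.find(reverse_complement(reverse))
    if start < 0 or site < 0 or site < start:
        return None
    return start + 1, site + len(reverse)


if __name__ == "__main__":
    # Hypothetical template and primers, used only to illustrate the check.
    template = "AAACCCGGGTTTACGTACGTTTGGCCAA"
    print(amplicon_bounds(template, "CCCGGG", "GGCCAA"))
    # SEQ ID NO:11 from the listing, shown as its reverse complement.
    print(reverse_complement("CTTGTGACTTGCTTTGATAGAATG"))
```

The search uses exact matching on the first occurrence of each site, so it does not model mismatches or repeated primer sites.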



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GACAACATGC TGGAGCCAAG TGC 23 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ACCACCAATT TTGTAAGAAC ATCCT 25



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TGTCCAGAGA TCCAAGTGCA GAAGG 25 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



GAGCTCCAGG AGACAGAAGC CATAG 25



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGGCCCCAAG GAAAACT 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TGGCAACCCT ACAACAGAC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGGCCCCAAG GAAAACT 17 



(2) INFORMATION FOR SEQ ID NO : 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TGGCAACCCT ACAACAGACC 



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
ACATTGAAGC ACTCCGCGAC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
AGAGTGGCAG CAACCAAGCT 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GCCTCAGGCT GGGGCAGCAT T 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GGTCACCTTC TGAGGGTGAA CTTGC 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

ACGACTCACT ATAAGCAGGA 20



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
AACAGCTATG ACCATCGTGG 20 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
ACGACTCACT ATGTGGAGAA 20 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
AACAGCTATG ACCCTGAGGA 20 



(2) INFORMATION FOR SEQ ID NO : 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
GGAGCTGCCT GACGGCCAGG TCATC 



(2) INFORMATION FOR SEQ ID NO : 29: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1599 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 115..744



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GCGGCAGGCG CGGCAAATTA CGTTGCCGGA GCTGAACGGC GCGGCTGGTC TGAAGGCAAA 60 

CAAGCGAGCG AGCGCGCGAT AGGGGCCGAG AGGACGCGCA GGTGGCGGCG TTGC ATG 117 

Met 
1 

TCG CAC GGT CAC AGC CAC GGA ATG GGT GAC TGC CGC TGC GCC GCC GAA 165 
Ser His Gly His Ser His Gly Met Gly Asp Cys Arg Cys Ala Ala Glu 
5 10 15 

CGG GAG GAG CCG CCC GAG CAG CAC GCC ATG GCT ACG CTG TAC CTG CGC 213 
Arg Glu Glu Pro Pro Glu Gln His Ala Met Ala Thr Leu Tyr Leu Arg
20 25 30 

ATC GAC CTG GAG CGG CTG CAA TGC CTT AAC GAG AGC CGC GAG GGC AGC 261 
Ile Asp Leu Glu Arg Leu Gln Cys Leu Asn Glu Ser Arg Glu Gly Ser
35 40 45 

GGC CGC GGC GTC TTC AAG CCG TGG GAG GAG CGG ACC GAC CGC TCC AAG 309 
Gly Arg Gly Val Phe Lys Pro Trp Glu Glu Arg Thr Asp Arg Ser Lys 
50 55 60 65 

TTT GTT GAA AGT GAT GCA GAT GAA GAG CTT CTG TTT AAT ATT CCA TTT 357 
Phe Val Glu Ser Asp Ala Asp Glu Glu Leu Leu Phe Asn Ile Pro Phe
70 75 80 

ACG GGC AAT GTC AAG CTC AAA GGC ATC ATT ATA ATG GGA GAG GAT GAT 405 
Thr Gly Asn Val Lys Leu Lys Gly Ile Ile Ile Met Gly Glu Asp Asp
85 90 95 

GAC TCA CAC CCC TCT GAG ATG AGA CTG TAC AAG AAT ATT CCA CAG ATG 453 
Asp Ser His Pro Ser Glu Met Arg Leu Tyr Lys Asn Ile Pro Gln Met
100 105 110

TCC TTT GAT GAT ACA GAA AGG GAG CCA GAT CAG ACC TTT AGT CTG AAC 501
Ser Phe Asp Asp Thr Glu Arg Glu Pro Asp Gln Thr Phe Ser Leu Asn
115 120 125

CGG GAT CTT ACA GGA GAA TTA GAG TAT GCT ACA AAA ATT TCT CGT TTT 549
Arg Asp Leu Thr Gly Glu Leu Glu Tyr Ala Thr Lys Ile Ser Arg Phe
130 135 140 145

TCA AAT GTC TAT CAT CTC TCA ATT CAT ATT TCA AAA AAC TTC GGA GCA 597
Ser Asn Val Tyr His Leu Ser Ile His Ile Ser Lys Asn Phe Gly Ala
150 155 160

GAT ACG ACA AAG GTC TTT TAT ATT GGC CTG AGA GGA GAG TGG ACT GAG 645
Asp Thr Thr Lys Val Phe Tyr Ile Gly Leu Arg Gly Glu Trp Thr Glu
165 170 175



CTT CGC CGA CAC GAG GTG ACC ATC TGC AAT TAC GAA GCA TCT GCC AAC 693 
Leu Arg Arg His Glu Val Thr Ile Cys Asn Tyr Glu Ala Ser Ala Asn
180 185 190 

CCA GCA GAC CAT AGG GTC CAT CAG GTT ACC CCA CAG ACA CAC TTT ATT 741 
Pro Ala Asp His Arg Val His Gln Val Thr Pro Gln Thr His Phe Ile
195 200 205 

TCC TAAGGGCTGG CCAAGGCTCC CATAGAGGCG CTGTGTCAGT GAAGATGTAC 794
Ser
210

GACTACCTGT TGGGAAGGAC AAAGGGATGA GGCTCCAGAG AGAGTTGGCT GCCACAGCTC 854 

TGCCAAGCTT TGTCTTTGGG GCTTGCTGCA GAAACCTGGC CTACGGAAGA TACGACACCA 914 

CTGGGAGGGT TGTGTAGGTG CCAGGGGACC ATCGTGGTTC TCTAGGGCGC TGTGGAAATT 974 

GGGTCTTGGG CTGGGTGGCA TCTGGCAGTC ATGGGTAACA CTTGCTTTTC CAGTTAATGT 1034 

GGCCATGTGA TTCCAAGTGT CATGTTGCTT TGTGGAAGAT TGTTGTGTGA CTTGTTTTTT 1094 

TGATTTTGTA TTTGTTTTTT TAAAGGAAAC TATTTGTGGG CTATAGGAAA CTTTCTGATG 1154 

CCTCCGGATT GTGTTAGTAG TAGCCATCAG GAGGGTCTCC AACTAAAACA CTTGTTCCTG 1214 

CTTGCTCCTT TCCCCTCTCA TTGTTCAGCA TTCTTGTCAA GTTGCCCAGC TTGGAGTTGT 1274 

CTGTCACGCA CATGTGTCCT GTGGTTATAG CTAGAAGGAC AGGAGTCTCC TGCTGATGCG 1334

TGATAGCTTA AGCTTGGGGA GAAGGTCTTT TCCACTGCCT AGCTAAGCAG TCTGGGGAGA 1394

GCATGGGGAT CATTTCTATG TGTGTGGGTA ATCTGGTCAG TAAGATTGAG ACTTAGTTAA 1454

GATTCCCCTT GGAAATTCCT TAATGTTTAT TAGCTTCTAA CTAGTGTTGT AAGTCCGATG 1514 

CCAGAATTTG GAGATTTGAG TTCTTCTTTT CATGGCTTTT ATTCACTGTG ACTAATAAGC 1574 

TTCCTAATAA ATCCTTGCCA GACTT 1599 
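
The CDS feature of SEQ ID NO:29 spans positions 115..744, which is 630 nucleotides or 210 codons, matching the 210 amino acids of SEQ ID NO:30. The sketch below shows how such a feature can be translated and checked. It assumes the standard genetic code; the function name is hypothetical, and the example uses only the first ten codons of the coding region as printed above, preceded by two arbitrary filler bases.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(seq, start, end):
    """Translate the 1-based inclusive interval [start, end] of a DNA sequence."""
    cds = seq.upper()[start - 1:end]
    if len(cds) % 3:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    # First ten codons of the SEQ ID NO:29 coding region, after two filler bases.
    fragment = "GC" + "ATGTCGCACGGTCACAGCCACGGAATGGGT"
    print(translate_cds(fragment, 3, 32))
    # Number of codons implied by the CDS location 115..744.
    print((744 - 115 + 1) // 3)
```

The translation is returned in one-letter code, whereas the listing uses three-letter code; the stop codon lies outside the stated CDS interval and is therefore not translated.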
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Met Ser His Gly His Ser His Gly Met Gly Asp Cys Arg Cys Ala Ala 
1 5 10 15 

Glu Arg Glu Glu Pro Pro Glu Gln His Ala Met Ala Thr Leu Tyr Leu
20 25 30

Arg Ile Asp Leu Glu Arg Leu Gln Cys Leu Asn Glu Ser Arg Glu Gly
35 40 45

Ser Gly Arg Gly Val Phe Lys Pro Trp Glu Glu Arg Thr Asp Arg Ser
50 55 60

Lys Phe Val Glu Ser Asp Ala Asp Glu Glu Leu Leu Phe Asn Ile Pro
65 70 75 80

Phe Thr Gly Asn Val Lys Leu Lys Gly Ile Ile Ile Met Gly Glu Asp
85 90 95

Asp Asp Ser His Pro Ser Glu Met Arg Leu Tyr Lys Asn Ile Pro Gln
100 105 110

Met Ser Phe Asp Asp Thr Glu Arg Glu Pro Asp Gln Thr Phe Ser Leu
115 120 125

Asn Arg Asp Leu Thr Gly Glu Leu Glu Tyr Ala Thr Lys Ile Ser Arg
130 135 140

Phe Ser Asn Val Tyr His Leu Ser Ile His Ile Ser Lys Asn Phe Gly
145 150 155 160

Ala Asp Thr Thr Lys Val Phe Tyr Ile Gly Leu Arg Gly Glu Trp Thr
165 170 175

Glu Leu Arg Arg His Glu Val Thr Ile Cys Asn Tyr Glu Ala Ser Ala
180 185 190

Asn Pro Ala Asp His Arg Val His Gln Val Thr Pro Gln Thr His Phe
195 200 205

Ile Ser
210



(2) INFORMATION FOR SEQ ID NO: 31: 



WO 98/24935 



137 



PCT/US97/22105 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CTGGCCTACG GAAGATACGA CAC 



(2) INFORMATION FOR SEQ ID NO : 32: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ACAATCCGGA GGCATCAGAA ACT 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

AGCCCCGGCC TCCTCGTCCT C 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



GGCGGCGGCA GCGGTTCTC 19
CLAIMS:



1. A method for identifying markers for a disease state, comprising the following steps:

a) providing a first set of peripheral blood mRNAs from one or more subjects known 
to exhibit said disease state and a second set of peripheral blood mRNAs from one or more 
normal subjects; 

b) amplifying both sets of mRNAs to provide nucleic acid amplification products; 

c) comparing said sets of amplification products; and 

d) identifying those mRNAs that are differentially expressed between normal 
subjects and subjects exhibiting said disease state; 

wherein a difference in quantity of expression of an mRNA is indicative of a disease marker. 

2. The method of claim 1, further defined as comprising the step of using said mRNAs as
templates for DNA synthesis in a reverse transcriptase reaction. 

3. The method of claim 2, wherein random hexamers, arbitrarily chosen oligonucleotides, 
promiscuous oligonucleotide primers, anchoring primers or a combination of these are used as
primers in the reverse transcriptase reaction. 

4. The method of claim 1, wherein arbitrarily chosen oligonucleotides, promiscuous 
oligonucleotide primers, anchoring primers or a combination of these are used as primers in the 
amplification step. 

5. The method of claim 1, wherein the disease state is metastatic or organ confined cancer,
asthma, lupus erythematosus, rheumatoid arthritis, multiple sclerosis, myasthenia gravis,
autoimmune thyroiditis, amyotrophic lateral sclerosis, interstitial cystitis or prostatitis. 

6. The method of claim 5, wherein the disease state is metastatic prostate cancer. 

7. The method of claim 5, wherein the disease state is metastatic breast cancer. 
8. The method of claim 1, wherein said subjects are laboratory animals.

9. The method of claim 1, wherein said subjects are humans. 

10. A method of detecting a metastatic cancer disease state in a subject, comprising the steps 
of: 

a) detecting the quantity of expression of a metastatic cancer disease marker 
expressed in peripheral blood of said subject; and 

b) comparing the quantity of expression of said marker in peripheral blood of said 
subject to the quantity of said marker expressed in peripheral blood of one or more normal 
subjects; 

wherein a difference in quantity of expression of said marker in peripheral blood of said subject 
relative to quantity of expression of said marker in peripheral blood of said one or more normal 
individuals is indicative of a metastatic cancer disease state. 

11. The method of claim 10, wherein said disease marker is an mRNA.

12. The method of claim 11, wherein said mRNA is amplified by an RNA polymerase 
reaction. 

13. The method of claim 11, wherein said mRNA is amplified by reverse transcriptase
polymerase chain reaction or ligase chain reaction.

14. The method of claim 10, wherein said detecting is by RNA fingerprinting, branched DNA 
or nuclease protection assay. 

15. The method of claim 10, wherein said metastatic cancer disease state is metastatic
prostate cancer. 
16. The method of claim 10, wherein said metastatic cancer disease state is metastatic breast
cancer. 

17. The method of claim 11 in which said mRNA comprises one or more of the sequences or
the complements of the sequences disclosed herein as Genebank Accession numbers D87451,
T03013, X03558, M28130, Y00787, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5 or SEQ ID NO:29.

18. The method of claim 10 in which said marker is a product of the interleukin 8 gene. 

19. The method of claim 10, wherein said metastatic cancer disease marker is identified by 
the method of claim 1.

20. The method of claim 11, further defined as comprising the steps of:

a) providing primers that selectively amplify at least a portion of said disease state
marker;

b) amplifying said disease state marker with said primers to form nucleic acid 
amplification products; 

c) detecting said nucleic acid amplification products; and 

d) measuring the amount of said nucleic acid amplification products formed. 

21. The method of claim 20 in which said primers are selected to produce an amplicon
having a sequence of or complementary to a sequence of at least a 50 base contiguous segment of
Genebank Accession numbers D87451, T03013, X03558, M28130, Y00787, SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, or SEQ ID NO:29. 

22. The method of claim 21, wherein said amplicon is from about 50 to about 500 bases in
length. 
23. The method of claim 21, wherein said amplicon is from about 100 to about 415 bases in
length. 



24. The method of claim 10, wherein said metastatic cancer disease marker is a polypeptide.

25. The method of claim 24, wherein said polypeptide is encoded by a nucleic acid sequence 
comprising the sequence disclosed herein as Genebank Accession numbers D87451, T03013,
X03558, M28130, Y00787, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ
ID NO:5, or SEQ ID NO:29.

26. The method of claim 24, wherein said detection comprises antibody immunoreaction with
said polypeptide.

27. The method of claim 26, wherein said detection comprises an ELISA,
immunoprecipitation, a radioimmunoassay, an immunohistochemical, Western blotting, dot
blotting, or FACS analyses.

28. The method of claim 24, wherein said polypeptide is encoded by the IL-8 gene. 

29. The method of claim 10 or claim 24, wherein said marker is a product of the IL-8 gene
and wherein said comparison is between two alternatively spliced forms of an IL-8 gene product. 

30. The method of claim 24, wherein the quantity of IL-8 polypeptide in peripheral blood is
measured using an in vitro bioassay that detects at least one IL-8 mediated biological process. 

31. The method of claim 29 wherein said markers comprise Genebank Accession # M28310,
Y00787, SEQ ID NO:4 and SEQ ID NO:5. 

32. A disease marker for prognosis or diagnosis of a disease condition, wherein said disease 
marker is identified by a process comprising: 
a) providing a first set of peripheral blood mRNAs from one or more subjects known
to exhibit said disease state and a second set of peripheral blood mRNAs from one or more 
normal subjects; 

b) amplifying both sets of mRNAs to provide nucleic acid amplification products; 

c) comparing said sets of amplification products; and 

d) identifying those mRNAs that are differentially expressed between normal 
subjects and subjects exhibiting said disease state; 

wherein a difference in quantity of expression of an mRNA is indicative of a disease marker. 

33. The disease marker of claim 32, wherein the disease state is metastatic or organ confined
cancer, asthma, lupus erythematosus, rheumatoid arthritis, multiple sclerosis, myasthenia gravis,
autoimmune thyroiditis, amyotrophic lateral sclerosis, interstitial cystitis or prostatitis. 

34. The method of claim 32, wherein the disease state is metastatic prostate cancer. 

35. The method of claim 32, wherein the disease state is metastatic breast cancer. 

36. The method of claim 32, wherein said subjects are laboratory animals. 

37. The method of claim 32, wherein said subjects are humans. 

38. A method of detecting prostate cancer in a biological sample, comprising: 

(a) measuring the levels of IL-8 in combination with at least one prostate disease 
marker in said sample; and 

(b) comparing said levels with corresponding levels obtained from reference 
populations of normal individuals, individuals with BPH and individuals with prostate cancer. 

39. The method of claim 38 in which said prostate disease marker is selected from a group 
consisting of: total prostate specific antigen (PSA); prostate specific membrane antigen 
(PSMA=Folic Acid Hydrolase); prostate acid phosphatase (PAP); prostatic secretory proteins
(PSP94); human kallikrein 2 (HK2); and the ratio of the concentrations of free and bound forms
of PSA (f/t PSA).



40. The method of claim 38 in which the biological sample comprises peripheral human 
blood. 

41. The method of claim 38 wherein the level of IL-8 in a biological sample is measured
using at least one antibody that binds to at least one IL-8 gene product. 

42. The method of claim 41 wherein the level of IL-8 gene product bound to antibody is 
measured by ELISA.

43. The method of claim 38 wherein the level of IL-8 in a biological sample is measured
using at least one oligonucleotide probe that binds to at least one IL-8 messenger RNA (mRNA).

44. The method of claim 43 wherein the IL-8 mRNA is alternatively spliced to include intron 



45. The method of claim 43 wherein the level of oligonucleotide probe bound to IL-8 mRNA 
is measured by nuclease protection assay. 

46. The method of claim 43 wherein the level of oligonucleotide probe bound to IL-8 mRNA 
is measured by RT-PCR™. 

47. The method of claim 43 wherein the level of oligonucleotide probe bound to IL-8 mRNA 
is measured by ligase chain reaction. 



48. The method of claim 43 wherein the level of oligonucleotide probe bound to IL-8 mRNA 
is measured by PCR™. 
49. The method of claim 40 wherein the level of IL-8 in a biological sample is measured
using an in vitro bioassay that detects at least one IL-8 mediated biological process. 

50. The method of claim 44 wherein the level of IL-8 in a biological sample is measured 
using at least one molecule that binds to an IL-8 gene product, wherein said molecule is selected 
from a group consisting of: an IL-8 binding protein; and an IL-8 receptor protein. 

51. The method of claim 48 wherein the level of prostate disease marker in a biological 
sample is measured using at least one antibody that binds to at least one prostate disease marker 
protein. 

52. The method of claim 51 wherein the level of prostate disease marker protein bound to 
antibody is measured by ELISA. 

53. The method of claim 39 wherein the level of prostate disease marker in a biological 
sample is measured using at least one oligonucleotide probe that binds to at least one prostate 
disease marker messenger RNA (mRNA). 

54. The method of claim 43 wherein the level of oligonucleotide probe bound to prostate 
disease marker mRNA is measured by nuclease protection assay. 

55. The method of claim 43 wherein the level of oligonucleotide probe bound to prostate 
disease marker mRNA is measured by RT-PCR™. 

56. The method of claim 43 wherein the level of oligonucleotide probe bound to prostate 
disease marker mRNA is measured by ligase chain reaction. 

57. The method of claim 43 wherein the level of oligonucleotide probe bound to prostate 
disease marker mRNA is measured by PCR™. 

58. A method of differentially diagnosing prostate cancer and benign prostatic hyperplasia, 
comprising the step of measuring the levels of IL-8 in combination with at least one prostate 
disease marker in a biological sample. 

59. The method of claim 58 in which said prostate disease marker is selected from a group 
consisting of: total prostate specific antigen (PSA), prostate specific membrane antigen 
(PSMA=Folic Acid Hydrolase), prostate acid phosphatase (PAP), prostatic secretory proteins 
(PSP94), human kallikrein 2 (HK2), and the ratio of the concentrations of free and bound forms 
of PSA (f/t PSA). 

60. The method of claim 59 in which said biological sample consists of peripheral human 
blood. 

61. A kit for use in detecting a human disease, comprising: 

(a) a pair of primers for amplifying a disease state marker consisting of a nucleic 
acid; and 

(b) containers for each of said primers. 

62. A kit according to claim 61 in which the pair of primers is selected to amplify a nucleic 
acid marker for metastatic human cancer. 

63. A kit according to claim 62 in which the pair of primers is selected to amplify a nucleic 
acid having a sequence comprising at least a 50 base segment of Genebank Accession numbers 
D87451, T03013, X03558, M28130, Y00787, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, 
SEQ ID NO:4, SEQ ID NO:5, or SEQ ID NO:29. 

64. A kit according to claim 62, comprising: 

(a) a pair of primers selected to amplify a nucleic acid sequence comprising SEQ ID 
NO:4 or Genebank Accession # Y00787; and 

(b) a pair of primers selected to amplify a nucleic acid sequence comprising SEQ ID 
NO:5 or Genebank Accession # M28130. 

65. A kit for use in diagnosing metastatic cancer in a biological sample, comprising: 

(a) an antibody which binds with high specificity to a polypeptide having an amino 
acid sequence encoded by a nucleic acid sequence comprising Genebank Accession numbers 
D87451, T03013, X03558, M28130, Y00787, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, 
SEQ ID NO:4, SEQ ID NO:5, or SEQ ID NO:29. 

(b) a container for said antibody. 

66. A kit according to claim 65, further defined as comprising: 

(a) an antibody that binds with high specificity to a soluble IL-8 gene product; 

(b) an antibody that binds with high specificity to a membrane bound IL-8 gene 
product; and 

(c) a container for each antibody. 

67. A kit according to claim 65, wherein said metastatic cancer is metastatic prostate cancer. 

68. A kit according to claim 65, wherein said metastatic cancer is metastatic breast cancer. 

69. A kit for detecting or differentially diagnosing human prostate cancer, comprising: 

(a) at least one detection agent for measuring the levels of IL-8 in a biological sample; 

(b) at least one detection agent for measuring the levels of at least one prostate disease 
marker in said biological sample; and 

(c) containers for each of said detection agents. 

70. The kit of claim 69 in which said prostate disease marker is selected from a group 
consisting of: total prostate specific antigen (PSA), prostate specific membrane antigen 
(PSMA=Folic Acid Hydrolase), prostate acid phosphatase (PAP), prostatic secretory proteins 
(PSP94), human kallikrein 2 (HK2), and the ratio of the concentrations of free and bound forms 
of PSA (f/t PSA). 



71. The kit of claim 70 in which said detection agents are selected from a group consisting 
of: polyclonal antibodies; monoclonal antibodies; oligonucleotides; paired oligonucleotides 
designed to bind to opposite strands of a double-stranded DNA molecule; and at least one 
molecule that binds to an IL-8 gene product. 

72. The method of claim 16 in which said breast cancer marker is selected from a group 
consisting of: SEQ ID NO:29 and Genebank Accession # D87451. 

Relative Quantitative RT-PCR Showing 
Differential Expression of IL-8 (=UC325) 
in peripheral blood of patients with 
Metastatic Prostate Cancer (M) and 
Normal Individuals (N) at different 
PCR cycles (cy) 

[Gel image: lanes for N and M samples and a no-template control at 25, 28 and 31 cycles; bands labeled int.+ and int.-] 

Two alternatively spliced forms of the IL-8 mRNA are 
observed. The upper band (int.+) includes intron 3 
in the mature mRNA. Int.- lacks intron 3. 

Relative Quantitative RT-PCR Showing 
Differential Expression of IL-8 (=UC325) 
in peripheral blood of patients with 
Metastatic Prostate Cancer (1-5) and a 
Pool of Normal Individuals (N) 

[Gel image: band labeled intron 3+; lanes include a no-template control] 

Two alternatively spliced forms of the IL-8 
mRNA are observed. (1-5) are different 
individuals with metastatic prostate cancer. 

Figure 2 

Ability of Total PSA (ng/ml) to Distinguish BPH and 
Stages A, B, & C Prostate Cancer (n = 142) 

Area Under the Curve: 0.5995 
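
The area under a receiver operating characteristic (ROC) curve, as reported for these figures, can be computed directly from marker values measured in two groups. The following is a minimal sketch only: the function name `roc_auc` and the sample values are hypothetical and are not taken from this document. It uses the rank formulation, in which the area equals the probability that a randomly chosen cancer sample has a higher marker value than a randomly chosen BPH sample, with ties counted as one half.

```python
def roc_auc(positives, negatives):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    positives: marker values for the group expected to score higher.
    negatives: marker values for the comparison group.
    """
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical marker values, for illustration only (not data from this document).
cancer = [12.0, 30.5, 8.1, 44.0]
bph = [5.2, 9.0, 3.3, 12.0]
print(roc_auc(cancer, bph))
```

An area of 0.5 indicates no ability to separate the two groups, and an area of 1.0 indicates perfect separation.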

Figure 3 

Ability of Corrected Free/Total PSA Ratio to Distinguish 
BPH and Stages A, B, & C Prostate Cancer (n = 142) 

Figure 4 

Ability of UC325 (pg/ml) to Distinguish BPH and 
Stages A, B, & C Prostate Cancer (n = 142) 

Area Under the Curve: 0.7973 





[ROC plot; vertical axis ticks from 0.00 to 1.00; horizontal axis: 1 - Specificity] 



Figure 5 

Ability of UC325 (pg/ml) & T-PSA (ng/ml) to Distinguish 
BPH and Stages A, B, & C Prostate Cancer (n = 142) 

Figure 6 

Ability of UC325 (pg/ml) & f/t PSA Ratio to Distinguish 
BPH and Stages A, B, & C Prostate Cancer (n = 142) 